ABSTRACT


A principal component analysis with varimax rotation
of the principal factors was performed on watershed, storm,
and runoff data from five central and eastern Montana
watersheds. The analyses provided information about the relative
importance of 29 independent variables to the peak discharge
rates and runoff volumes they produce.

Storm intensity, standard deviation of storm intensities,
soil and air temperature, watershed azimuth, overland
slope, watershed shape, reservoir area, and watershed
area were among the successively most important variables.
Correlations among some of the variables for the research
watersheds were also indicated by the analyses. Principal
component and rotated-factor regression equations for the
runoff variables were developed and are suggested as
prediction equations for ungaged watersheds in central and
eastern Montana.
Chapter I
INTRODUCTION 

To make reasonable predictions of peak discharge
rates and total runoff volumes on small watersheds,
an understanding of the factors causing the runoff is needed.
If the important watershed and storm variables could
be properly identified and measured, the relationships
between these variables and the runoff could then be readily
determined.

HISTORICAL  BACKGROUND 

Many attempts have been made to relate peak discharge
rates and total runoff volumes from a watershed to their
causative factors. The relationships that have been proposed
usually take the form of graphs or equations relating
the rates and volumes to factors that are believed to be
important. The general trend in these methods is to choose
the "important" factors, obtain measurements of each, and
then relate the factors to the rates and volumes produced.
However, the choice of factors is usually made from experience
or judgment, and the accuracy of the results depends on
whether the factors were properly chosen. A comparison
of the methods indicates that there is considerable confusion
as to which factors are to be used, and which
factors are more important than others. After analyzing
several methods of discharge rate and runoff volume
prediction, Sharp and Biswas (1965) wrote: "Exhaustive analysis
of research data from small watersheds not only failed
to reveal how various factors function in producing
runoff, but failed even to reveal the parameters that
should be used to estimate runoff." The wide range of
factors used in prediction methods seems to substantiate
this. Some factors appear in more of the methods than do
other factors, but none of the methods agree on which set
of factors is the most important.

PURPOSE  AND  METHODS 

The study reported herein is an attempt to determine,
for five central and eastern Montana watersheds, which of
29 factors are more important in producing peak discharge
rates and total runoff volumes, and to develop the regression
equations relating peak discharge rates and runoff volumes to
these factors. Multivariate statistical analyses are used
to investigate the relative importance of each of the
independent variables to the two dependent variables, peak
discharge rate and total runoff volume. As explained in a
later chapter, a principal component analysis of the
correlation matrix of the independent variables is performed,
and this is followed by a varimax rotation of the factor
weight matrix to determine which variables are most
important. Regression equations for the dependent variables
are found from the important variables, using the
uncorrelated principal components and the rotated factor
weight matrix.

Although only a few investigators have applied multivariate
methods to hydrologic data, several have recommended
their use in this field. The advantages, which are presented
in a later chapter, indicate that they provide a
significant improvement over some of the other methods
used. The methods are employed here because they are well
suited to the large volumes of data that have been generated
on the watersheds being investigated. Because of the
large amounts of data involved, multivariate analyses were
not practical until the advent of the electronic digital
computer. Computer techniques were used extensively in the
data reduction and analyses for this study.

SCOPE 

Data for this study were taken from five small central
and eastern Montana watersheds currently being studied for
the Drainage Correlation Research Project by the Department
of Civil Engineering and Engineering Mechanics at
Montana State University. The Drainage Correlation Research
Project, which was initiated in 1963, is sponsored
by the Montana State Highway Commission and the Bureau of
Public Roads, and is an investigation of the frequency of
peak discharge rates for small watersheds in Montana.

Data collection is expected to continue until September
1969. Fifty runoff events on the five project watersheds,
having peak discharge rates greater than 10 cfs and occurring
between April 1964 and September 1967, were
studied for the investigation reported herein.


DEFINITIONS 


Because certain terms and phrases are frequently used
in this paper, several definitions are presented here.
The definitions are those commonly used in the literature,
and some are more thoroughly discussed in later chapters.

Dependent variable - the variable to be predicted from
measurements of the independent variable(s) in a
regression equation (e.g., peak discharge rate).

Independent variables - the variables on which
measurements are obtained and substituted into the
regression equation to calculate the prediction of the
dependent variable (e.g., precipitation intensity,
watershed area, etc.).

Regression equation - an equation for the dependent
variable, derived from several measurements of this
variable and the independent variable, or variables,
in a manner which indicates the relationship of the
independent variables to the dependent variable. (If
more than one independent variable is involved, the
equation is usually termed a "multiple regression
equation.")

Multiple linear regression equation - a multiple
regression equation in which the dependent variable is
related to a sum of the independent variables, with
each of the independent variables being multiplied
by a different coefficient.

Coefficient of a variable - a constant to be multiplied
by the measurement of a variable in a regression
equation, component, or factor.

Multivariate  studies  -  methods  of  studying  more  than 
two  variables  when  the  measurements  of  the  variables 
are  obtained  simultaneously  in  time  or  space. 

Linear  correlation  of  two  variables  -  a  statistical 
measure  of  the  closeness  to  a  straight  line  of  the 
graphical  plot  of  measurements  of  both  variables. 

Components - Principal Components - Normalized
Eigenvectors - derived, uncorrelated, independent variables
written as sums of the original independent variables,
each multiplied by a coefficient.

Factor - a component whose coefficients on the
independent variables have been multiplied by a constant.
The squared coefficients within a factor total to the
square of the constant.

Rotated  factor  -  a  factor  whose  large  coefficients  on 
the  independent  variables  have  been  maximized. 

Variate  -  a  component,  factor,  or  rotated  factor. 
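
The distinction between a component and a factor in the definitions above can be illustrated with a small sketch. It assumes only two standardized independent variables with a hypothetical correlation r = 0.6, for which the eigenvectors of the correlation matrix have a closed form. The choice of the square root of the eigenvalue as the multiplying constant is a conventional assumption, not something stated in the definitions, and all function names are illustrative.

```python
import math

def principal_components_2x2(r):
    """Return (eigenvalue, unit eigenvector) pairs for the correlation
    matrix [[1, r], [r, 1]] of two standardized variables."""
    s = 1.0 / math.sqrt(2.0)
    # Closed-form solution for the 2 x 2 case.
    return [(1.0 + r, (s, s)), (1.0 - r, (s, -s))]

def component_to_factor(eigenvalue, component):
    """Multiply every coefficient of a component by one constant.
    Using sqrt(eigenvalue) as the constant is an assumed convention."""
    constant = math.sqrt(eigenvalue)
    return constant, tuple(constant * c for c in component)

def sum_of_squares(coefficients):
    return sum(c * c for c in coefficients)

r = 0.6  # hypothetical correlation between two independent variables
components = principal_components_2x2(r)
factors = [component_to_factor(lam, vec) for lam, vec in components]

for (lam, vec), (constant, fac) in zip(components, factors):
    print("component", vec, "sum of squares =", round(sum_of_squares(vec), 6))
    print("factor   ", fac, "sum of squares =", round(sum_of_squares(fac), 6),
          "constant squared =", round(constant ** 2, 6))
```

In this sketch the squared coefficients of each component total one (a normalized eigenvector), while those of each factor total the square of the multiplying constant, as the definition of a factor states.
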
Chapter II
LITERATURE  REVIEW 

In recent years a great many investigators have
approached the problem of peak discharge frequencies from
small watersheds, and have applied a great variety of
techniques in analyzing the problem. Many of the
investigators have used statistical methods, and several have
applied multivariate analyses to hydrologic data. Reports
of a large number of these investigations were reviewed in
the course of the study reported herein. To discuss all
the literature that was reviewed would result in an
extremely voluminous and unwieldy report. The review herein
is therefore confined to the
results of a few of the more relevant studies that bear
directly on the present investigation.

INVESTIGATIONS  WITH  MULTIVARIATE  ANALYSES 

Prior to 1950, multivariate methods were well
established, but were not practical in hydrology because of the
time-consuming computations. However, as the high-speed
computer became more accessible to investigators, the
methods were recognized as a possible means of studying the
peak discharge rate and runoff volume prediction problem.
For example, Wong (1963) used multivariate methods to
analyze data from 90 basins in New England. The basins
ranged in area from 10 to 2000 square miles, and
measurements of eleven independent variables were taken on each
of these basins to determine a regression equation for the
mean annual flood with a recurrence interval of 2.33 years.

Wong found that a previous "ordinary" multiple
regression on five of the eleven independent variables was
not satisfactory because the variables (average land slope,
mean altitude, tributary channel slope, stream density,
and shape of basin) were, according to Horton's Laws,
"multicollinear," meaning that some were linearly related to
others and should not be included. After measurements
were obtained on six additional variables (drainage area, main
channel slope, tributary channel slope, percentage of area
in ponds and lakes, length of longest watercourse, and
precipitation intensity), a principal component analysis and
varimax rotation were performed on the data. This resulted
in two new, unrelated variables or "components." Both of
these components were linear functions of all the
measured variables, but the new components were not linearly
related to each other. Also, the first component was
found to be significantly more important than the second,
and each component was found to be more highly associated
with certain variables. The first had large coefficients
on variables which expressed the area and length of the
drainage basin, and the second was associated with the slope
and topography of the drainage basin. Both components were
about equally associated with mean annual flood, indicating
that both were important to the dependent variable.

Because the two components indicated that size and
length were not related to slope and topography, Wong
decided that these two parameters would form a good set of
independent variables for a multiple regression equation.
He examined the correlations of mean annual flood with the
components and the eleven independent variables, and found
that the length of the main stream, L, was highly
correlated with both the mean annual flood and the first
component, and that the average land slope, S, was similarly
related to the second component and mean annual flood.
These two variables were consequently chosen for a
multiple regression, giving:

$$\log Q_{2.33} = -1.02 + 1.29 \log L + 0.97 \log S$$
for the regression equation. This equation had a coefficient
of determination of 0.80, which meant that 80 per cent of
the variation in the mean annual flood could be explained
by only two variables instead of eleven. The previous
regression on five independent variables had the same
coefficient of determination, but involved linearly related
variables. The study, therefore, reduced the number of
"independent" variables needed to explain the same
variation in mean annual flood for New England.
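
As a sketch of how the two-variable regression quoted above would be applied, the function below evaluates the mean annual flood from the main-stream length L and the average land slope S. Common (base-10) logarithms are assumed, the units of L, S, and Q are not given in this review and are left unspecified, and the input values are hypothetical.

```python
import math

def wong_mean_annual_flood(length, slope):
    """Evaluate log Q = -1.02 + 1.29 log L + 0.97 log S for Q.
    Base-10 logarithms are an assumption; units are not stated here."""
    log_q = -1.02 + 1.29 * math.log10(length) + 0.97 * math.log10(slope)
    return 10.0 ** log_q

# Hypothetical inputs, chosen only to show the calculation.
L_main_stream = 20.0
S_land_slope = 5.0
print(wong_mean_annual_flood(L_main_stream, S_land_slope))
```

Because the equation is linear in the logarithms, it is equivalent to a power law: doubling L multiplies the predicted flood by 2 raised to the power 1.29, regardless of S.
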

Eiselstein (1967) performed a similar analysis,
although he was interested in runoff volume instead of mean
annual flood rate. A 350-acre watershed was divided into
17 runoff plots on which data for 30 variables were
obtained over a period of four years. The variables were grouped
into five categories: storm variables, antecedent moisture
variables, site variables, soil description variables, and
the dependent variable, runoff in inches from each plot.

Because Eiselstein was aware of the inadequacy of
ordinary multiple regression to provide a good prediction
equation when the independent variables are not truly
independent, he performed three separate analyses to show
the different results that can be obtained. An ordinary
linear regression analysis gave an equation in terms of
all 29 "independent" variables, and 13 regression
coefficients were found to be statistically significant, i.e.,
non-zero. This equation accounted for 77 per cent of
the variation in the runoff volume, but Eiselstein was
not satisfied with the results because the non-significant
variables had high coefficients of correlation with
each other and with the significant variables. Also, the
test for significance was not valid because correlated
variables were used. This meant that the "significant"
variables were a combined measure of several variables, and
the "non-significant" variables could not honestly be
discarded.

To attempt to separate the combined effects of
several variables into the effect of each, a principal
component analysis of the correlation coefficients of all
combinations of the independent variables was performed.
This resulted in 29 new, independent variables, or
"components," which were each linear functions of all the 29
original "independent" variables. The first of these
components had high coefficients on the rainfall variables,
but the remaining components could not be readily
associated with specific variables. This is the reason
for the varimax rotation, which rotates the components to
another set of reference axes so that only high or low
coefficients exist on the original variables, and a
better interpretation can be made.
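
The idea of rotating to axes on which coefficients are either high or low can be sketched for the simplest case of two components. The example below searches rotation angles by brute force for the one that maximizes the raw varimax criterion (the variance of the squared coefficients within each column). This is an illustrative stand-in, not Kaiser's iterative procedure and not the program used in any of the studies reviewed; the coefficient table is hypothetical.

```python
import math

def varimax_criterion(loadings):
    """Raw varimax criterion: for each factor (column), the variance of
    the squared coefficients, summed over factors."""
    p = len(loadings)
    total = 0.0
    for j in range(len(loadings[0])):
        squares = [row[j] ** 2 for row in loadings]
        mean_square = sum(squares) / p
        total += sum((s - mean_square) ** 2 for s in squares) / p
    return total

def rotate(loadings, theta):
    """Orthogonal rotation of a two-column coefficient table by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [(a * c + b * s, -a * s + b * c) for a, b in loadings]

def varimax_two_factors(loadings, steps=9000):
    """Grid search over [0, pi/2); the criterion repeats every pi/2."""
    best_theta, best_value = 0.0, varimax_criterion(loadings)
    for k in range(1, steps):
        theta = (math.pi / 2.0) * k / steps
        value = varimax_criterion(rotate(loadings, theta))
        if value > best_value:
            best_theta, best_value = theta, value
    return best_theta, rotate(loadings, best_theta)

# Hypothetical unrotated coefficients of four variables on two components.
loadings = [(0.7, 0.5), (0.8, 0.4), (0.6, -0.6), (0.5, -0.7)]
theta, rotated = varimax_two_factors(loadings)
print("angle (degrees):", round(math.degrees(theta), 2))
for row in rotated:
    print([round(v, 3) for v in row])
```

The rotation is orthogonal, so the sum of squared coefficients for each variable is unchanged; only the distribution of the coefficients between the two factors changes.
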

Before rotating the "principal components," Eiselstein
computed values for each component by using the original
variable data to solve the linear equations. This
resulted in a numerical value of each component for each
runoff event. Because the components were truly independent,
a multiple regression on these values was performed. After
the regression coefficient for each component was found, 18
of the 29 coefficients exhibited significance. Also, the
regression equation explained 77 per cent of the variation,
which was exactly the same amount explained by the ordinary
regression equation. This analysis gave a good prediction
equation because truly independent variables were used.
However, because each significant component was a linear
function of all 29 original variables, nothing could be
stated about the importance of the original variables at
this point in the analysis.
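
The observation that a regression on all of the components explains exactly the same variation as the ordinary regression can be checked on a small case. The sketch below uses two correlated predictors and invented data; it is not Eiselstein's data or program. For two standardized variables the principal components are proportional to their sum and difference, so no eigenvalue routine is needed.

```python
import math

def standardize(values):
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def mean_product(a, b):
    return sum(x * y for x, y in zip(a, b)) / len(a)

# Hypothetical measurements: two correlated predictors and a response.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.2, 1.9, 3.4, 3.8, 5.1, 6.3]
y = [2.1, 2.9, 5.2, 5.8, 8.1, 8.8]

z1, z2, zy = standardize(x1), standardize(x2), standardize(y)
r12, r1y, r2y = mean_product(z1, z2), mean_product(z1, zy), mean_product(z2, zy)

# Ordinary regression on the standardized predictors (2 x 2 normal equations).
det = 1.0 - r12 ** 2
b1 = (r1y - r12 * r2y) / det
b2 = (r2y - r12 * r1y) / det
r2_ols = b1 * r1y + b2 * r2y

# Principal components of [[1, r12], [r12, 1]]: scaled sum and difference.
s = 1.0 / math.sqrt(2.0)
c1 = [s * (a + b) for a, b in zip(z1, z2)]
c2 = [s * (a - b) for a, b in zip(z1, z2)]

# The components are uncorrelated, so each coefficient is found separately.
g1 = mean_product(c1, zy) / mean_product(c1, c1)
g2 = mean_product(c2, zy) / mean_product(c2, c2)
r2_pcr = g1 * mean_product(c1, zy) + g2 * mean_product(c2, zy)

print("R^2 ordinary regression :", round(r2_ols, 6))
print("R^2 component regression:", round(r2_pcr, 6))
print("coefficients recovered  :", round(s * (g1 + g2), 6), round(s * (g1 - g2), 6))
```

The two coefficients of determination agree because the full set of components spans the same space as the original variables. A difference arises only when some components are dropped.
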

To further investigate the separate variables,
Eiselstein's third analysis consisted of a multiple
regression on rotated components instead of the original
components. An initial, orthogonal varimax rotation of
the original components yielded a set of components which
all had low coefficients on 12 of the original variables.
Because these variables were not significant to any of the
rotated components, they were deemed unimportant to the
runoff and discarded. Also, only the first seven
components were deemed to be important because the rest each
contributed less than one per cent to the variation of the
dependent variable. This resulted in seven linear equations
for 17 of the original variables, giving essentially the
same information as the first 29 components and variables.

A second varimax rotation of the seven components
gave seven new components which could be interpreted in
terms of the 17 remaining independent variables. One of
the components accounted for 51 per cent of the variation
in runoff, and had high coefficients on precipitation
intensity and total precipitation. The second most
important component accounted for 5 per cent of the variation
in runoff and had high coefficients on slope, elevation,
and "aspect." The third component, accounting for four
per cent of the variation in runoff, had high coefficients
on surface soil properties. The other components each
explained only a small portion of the variation in runoff,
and could not be associated with specific variables. The
important interpretation from the components was that the
rainfall variables were much more important to runoff than
the aspect and soil properties, because they accounted for
51 per cent of the variation in runoff. Also, the variables
within each component were linearly dependent and their
combined effect was independent of the combined effect of
the variables of other components. This meant that a
multiple regression on the components, or on certain
variables from each component, would be a regression on truly
independent variables. (Wong had made this same
observation in 1963 and had written his final regression equation
in terms of two nearly independent variables, one from
each important rotated component. Other variables were
significant to the components, but the two he chose were
combined measures of those in each component.)

Eiselstein chose to find a multiple regression
equation for all 17 of the variables in the seven components,
rather than select one variable from each component to
represent the total component. After substituting data from
the original variables into the component equations, a
multiple regression of the values computed gave an equation
for runoff in terms of the components, and hence in terms
of the 17 independent variables, because the components
were linear functions of the variables. The coefficients
of this final equation seemed to be realistic because no
obvious fallacies in the signs of the coefficients could
be detected. For example, the rainfall characteristics
were directly and not inversely related to runoff as is
the case in some ordinary multiple regression equations
(Sharp, Gibbs, Owen, and Harris, 1960; Wallis, 1965).
The final coefficient of determination was 0.67, indicating
a 10 per cent loss of information as a result of reducing
29 variables to 17, and 29 components to 7.

Snyder (1961) performed a principal component regression
analysis relating total December runoff from a watershed to
the rainfall in October, November, and December. A previous,
ordinary multiple regression equation had negative
coefficients on all the independent variables, indicating that
runoff increased as the monthly rainfall decreased. Because
this was "intuitively" inaccurate, Snyder obtained three
independent components from a principal component analysis,
derived a regression equation for these components, and
hence for the variables, and found that all coefficients
were positive. A reduction in the coefficient of
determination from 0.83 to 0.75 for the principal components
solution resulted, but Snyder felt that the loss in
information about the variation in runoff was justified by the
intuitively correct coefficients. This study was used to
illustrate the possibilities for multivariate analyses in
hydrologic studies, and no rotation of the principal
components was made, because the component regression equation
was deemed to be satisfactory.

The three above investigations used essentially the
same techniques. In all of them, the ordinary multiple
regression solution was unsatisfactory, and multivariate
techniques were used to overcome this difficulty. Each
used a principal component analysis to obtain truly
independent variables for the regression. Also, rotation
of these components was used to provide information on
the importance of certain variables in two of the papers.


RECOMMENDATIONS OF THE LITERATURE


The previous section discussed some of the
applications of multivariate analyses. A short presentation of
several recommendations found in the literature follows.
The purpose here is to present the opinions of several
authorities on the use of multivariate methods with
hydrologic data.


Eiselstein's investigation was undertaken as a
result of a paper by Wallis (1965), which suggested the
advantages of multivariate methods over ordinary multiple
regression for hydrologic studies. Wallis compared the
presently used methods of obtaining an equation for the
dependent variable in terms of several independent variables,
and made several recommendations. First, he suggested that
a linear-logarithmic transformation model, similar to Wong's,
be used with hydrologic data. This should be followed by
a principal component regression analysis with varimax
rotation of the principal components for an initial analysis
of "multifactor hydrologic problems." These
recommendations were made after a study of the adequacy of the methods
in obtaining a known functional relationship, the weight of
a solid cylinder in terms of its density and dimensions.

Wallis (1968) presented several other suggestions for
the best utilization of multivariate statistical methods in
hydrologic studies. First, he suggested that no more than two
variables important to any factor in the rotated factor table
be retained for further study. After the retained variables
are selected, Wallis suggested that a complete principal
component analysis with varimax rotation of the principal
factors be performed on the original measurements of only these
variables. A regression on these factors is then suggested
for the prediction equation.

After his analysis of 90 New England basins, Wong
(1963) stated that "multivariate methods should be more
widely encouraged in geomorphic and hydrologic research"
(p. 198). Eiselstein recommended that "a principal
component analysis with varimax rotation of the factor weight
matrix is a suitable statistical technique for the
correlation of small watershed surface characteristics with surface
runoff" (p. 484). Although Snyder did not use or recommend
varimax rotation, he did state that principal component
regression analyses give "logical equations" for runoff
when compared to ordinary multiple regression techniques.

METHODS  APPLIED  TO  MONTANA  WATERSHEDS 

Multivariate methods have not been previously applied
to small watersheds in central and eastern Montana. Boner
(1963) presented a report of his study of the frequency and
magnitude of floods in eastern Montana for the United States
Geological Survey. This report presents an ordinary
multiple regression equation for mean annual flood in terms of
area, stream meander length, geographical region, and
elevation of small watersheds in eastern Montana. The dependent
variable was found to be directly proportional to the first
three variables, and inversely related to elevation. Also,
the flood having a recurrence interval, I, is obtained by
multiplying the mean annual flood by a factor from a
"composite flood frequency curve" for that interval. Variables
other than those listed were not used because measurements
were not available.

Boner and Omang (1967) presented a report on the
magnitude and frequency of floods from watersheds smaller
than 100 square miles in area in Montana. This report gives
a method of obtaining the floods with recurrence intervals
of 10 and 25 years for this region. The 10-year flood is
found from a regression equation involving the area,
elevation, channel slope, and mean annual runoff of a watershed,
and the 25-year flood is obtained from the 10-year flood
upon multiplication by an empirically determined constant.

Both of the above studies employed ordinary multiple
regression techniques with only four easily determined
independent variables.

The studies of Wong, Eiselstein, and Snyder used more
than four variables, but the equations developed could not
reasonably be used in Montana. However, the methods should
be applicable to any region. None of the literature
reviewed indicates that attempts have been made to determine
equations for both the peak discharge rate and the runoff
volume in a single analysis.
Chapter III

THEORETICAL  DEVELOPMENT 

In this chapter, a survey of the available methods
of analyzing runoff from small watersheds using
measurements of several related variables is presented, followed
by a discussion of the reasons the specific methods for
this analysis were chosen. A complete theoretical
development of the methods chosen concludes the chapter.

POSSIBLE  METHODS  OF  ANALYSIS 

Ordinary multiple linear regression is one of the few
statistical methods of simultaneously analyzing several
variables to estimate one or more of them. The analysis
is simply an analytical method of fitting a line, plane,
or hyperplane through a multitude of data points. In
general, a linear equation with an unknown intercept and
unknown slopes is assumed, and the intercept and slopes are
calculated from the data in a manner which minimizes the squared
distances from the points to the line, plane, or hyperplane.
Until recently, ordinary multiple linear regression of the
logarithms of the variables has probably had the most use
in predicting runoff from small watersheds (Sharp, et al.,
1960).
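
A minimal sketch of ordinary multiple linear regression on the logarithms of the variables follows. It forms and solves the normal equations, which minimize the squared distances measured along the dependent-variable axis. The watershed areas, slopes, and the power-law relation used to generate the discharges are hypothetical, and the function names are illustrative rather than taken from any program cited here.

```python
import math

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_multiple_regression(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] from the
    normal equations (X'X) b = X'y."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Hypothetical data generated from Q = 2.0 * A**0.8 * S**0.5 (no scatter).
area = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
slope = [0.05, 0.02, 0.08, 0.03, 0.10, 0.04]
discharge = [2.0 * a ** 0.8 * s ** 0.5 for a, s in zip(area, slope)]

rows = [(math.log10(a), math.log10(s)) for a, s in zip(area, slope)]
log_q = [math.log10(q) for q in discharge]
coef = fit_multiple_regression(rows, log_q)
print("intercept, exponent on A, exponent on S:", [round(c, 4) for c in coef])
```

Because the example data contain no scatter, the fit recovers the generating exponents; with real measurements the coefficients are only estimates, and they become unstable when the predictors are strongly correlated.
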
The greatest objection to the use of multiple
regression analyses with hydrologic data is that certain
basic assumptions of multiple regression theory are
violated. Multiple regression assumes that there are no
correlations among the independent or "predictor" variables
(Thurstone, 1947). Hydrologic variables, as shown by
Wong, usually violate this assumption. If ordinary
multiple regression is attempted, the coefficients of the
multiple regression equation for the dependent variable have
sometimes been found to be "absurd" and "grossly in error"
(Thurstone, 1947, p. 61).

Many regression equations for runoff have been
derived using many combinations of independent variables.
Snyder (1962) studied some of these equations and
concluded that multiple regression analyses do not yield "logical
equations" when used in hydrology. In these cases, he was
referring to the coefficients associated with each of the
independent variables and not necessarily to the
accuracy of the prediction equation. A presentation of the
mechanics of multiple linear regression was given by DuBois
(1957) and Baggaley (1964).

Another statistical approach to the problem of
investigating several variables was developed by Harris,
et al. (1961). Their purpose was to present a method of
selecting the most important independent variables, and
to use only these variables in the regression equation.
A Taylor series expansion was used to determine
successively the most important variables by initially removing
the effects of the other variables. The method is an
analytic approach to "graphical curvilinear multiple
regression," and eliminates the usual "shotgun" search for
important variables. The method provides a statistical
means of determining the important variables, but because
hydrologic variables are generally correlated, as
discussed earlier, attempts at multiple regression are often not
successful.

Wong  (1963)  discusses  several  other  possible  methods 
of  analyzing  runoff  in  terms  of  several  independent  vari¬ 
ables.  One  of  these,  "stepwise  multiple  regression,"  is 
similar to Harris' analysis because the variable which
contributes  most  to  the  variation  in  the  dependent  vari¬ 
able  is  determined.  Its  effects  are  then  removed,  and 
the  second  most  important  variable  is  found.  This  pro¬ 
cess  is  repeated  until  enough  variables  are  found  to 
account  for  all  or  most  of  the  variation  in  the  dependent 
variable,  and  a  multiple  regression  is  performed  on  these 
variables. Again, the relative importance of the independent
variables is indicated, but the multiple regression
assumption may be violated. Ralston (1960) presents a
development of stepwise multiple regression, and outlines
the procedures for programming the method for digital
computers.
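
The stepwise procedure described above can be sketched briefly. The Python sketch below is illustrative only and is not Ralston's program: it ranks variables by forward selection, removing the effect of each chosen variable from the dependent variable and from the remaining candidates before the next variable is sought. The function name `forward_stepwise`, the `min_gain` cutoff, and the sample data are hypothetical.

```python
def forward_stepwise(columns, y, min_gain=0.01):
    """Rank predictors by forward selection.

    columns: dict mapping a variable name to its list of observations.
    y: observations of the dependent variable.
    min_gain: stop when the best remaining variable explains less than
        this fraction of the total variation in y (an assumed cutoff).
    Returns [(name, fraction of variation explained), ...] in entry order.
    """
    def centered(values):
        mean = sum(values) / len(values)
        return [v - mean for v in values]

    def dot(a, b):
        return sum(p * q for p, q in zip(a, b))

    residual = centered(y)
    total = dot(residual, residual)
    remaining = {name: centered(vals) for name, vals in columns.items()}
    selected = []
    while remaining and total > 0.0:
        # Variation in the current residual that each candidate would explain.
        gains = {}
        for name, x in remaining.items():
            xx = dot(x, x)
            gains[name] = dot(x, residual) ** 2 / xx if xx > 1e-12 else 0.0
        best = max(gains, key=gains.get)
        if gains[best] / total < min_gain:
            break
        x_best = remaining.pop(best)
        xx = dot(x_best, x_best)
        # Remove the effect of the chosen variable from the dependent variable ...
        slope = dot(x_best, residual) / xx
        residual = [r - slope * x for r, x in zip(residual, x_best)]
        # ... and from every remaining candidate.
        for name in list(remaining):
            c = dot(remaining[name], x_best) / xx
            remaining[name] = [xi - c * xb
                               for xi, xb in zip(remaining[name], x_best)]
        selected.append((best, gains[best] / total))
    return selected


# Hypothetical data: y depends on "rain" and "area" but not on "wind".
rain = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
area = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
wind = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0]
y = [3.0 * r + a for r, a in zip(rain, area)]
ranked = forward_stepwise({"rain": rain, "area": area, "wind": wind}, y)
```

With these data "rain" enters first and "area" second, and "wind" is never selected. The ranking indicates relative importance only; it does not show whether the independence assumption of the final regression is satisfied.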

Another  method  of  handling  this  problem  is  by  using 
multivariate  statistical  analyses.  Several  of  these  are 
outlined  in  the  next  section,  along  with  the  advantages 
and  disadvantages  of  each.  The  methods  chosen  for  use  in 
this  investigation  are  indicated,  and  the  reasons  for  the 
selection are outlined.

MULTIVARIATE  STATISTICAL  METHODS 

One  commonly  used  multivariate  statistical  analysis 
is known as "component analysis." There are two varieties
of component analysis, known as "principal component analysis"
and "centroid analysis." Both are mathematical means
of  obtaining  new  variables  or  "components"  from  the  inter¬ 
correlations  of  the  chosen  independent  variables.  The 
objective  in  the  analysis  is  that  the  components  obtained 
will,  (1)  be  fewer  than  the  number  of  independent  variables 
under  study,  (2)  represent  or  reproduce  the  original  vari¬ 
ables,  (3)  account  for  all  the  variation  in  the  original 
variables  and,  (4)  be  uncorrelated  even  if  the  original 
variables were highly correlated with each other. Principal
component analysis extracts the new variables, or "variates,"
one by one. The first component is extracted in
such  a  manner  that  it  reproduces  a  maximum  amount  of  the 
information  in  the  original  data.  The  second  component 
accounts  for  a  maximum  amount  of  the  information  remaining 
after  the  extraction  of  the  first  component.  This  process 
is  repeated  until  all  the  components  have  been  extracted, 
and  all  the  original  information  in  the  data  is  reproduced 
by the components.

Centroid  analysis  has  been  termed  a  "simplified 
approximation  of  the  principal  components  solution"  (Cooley 
and Lohnes, 1962, p. 153), and is used to avoid the involved
calculations  of  a  principal  component  analysis,  namely,  the 
solution  of  the  "characteristic  equation,"  defined  later. 
Kendall  (1957)  gives  an  example  of  the  "approximate"  cen¬ 
troid  solution  compared  to  the  principal  component  solution 
of  the  same  problem,  and  shows  the  different  solutions 
obtained.  When  computer  availability  obviates  most  concern 
for  involved  computations,  the  principal  component  analysis 
is  the  better  method. 

"Factor  analysis"  is  a  multivariate  method  similar  to 
component  analysis,  but  differing  in  the  general  method  of 
approaching  a  problem.  A  factor  is  defined  as  a  linear 
equation in terms of all or some of the original variables,
but is not the same as a component. Kendall (1957) states
that  component  analyses  attempt  to  proceed  from  the  data  to 
a  model,  while  factor  analyses  begin  with  a  model  and  in¬ 
vestigate  its  agreement  with  the  results.  In  factor  analy¬ 
ses,  an  investigator  examines  the  data  and  speculates  on 
how  many  factors  or  groupings  of  variables  might  be  present. 
Guilford (1952), in a discussion on when to factor analyze,
illustrated  this  point  by  writing:  "The  initial  planning 
should  emphasize  the  formation  of  hypotheses  as  to  what 
factors  are  likely  to  be  found  in  the  selected  domain  and 
to  the  probable  properties  of  such  factors"  (p.  35).  For 
example,  if  the  relationship  of  peak  discharge  rate  to 
several  variables  such  as  soil  moisture,  air  temperature, 
wind  speed,  watershed  area,  storm  duration,  stream  channel 
lengths,  excess  precipitation  intensity,  soil  permeability, 
etc.,  is  desired,  then  the  variables  might  initially  be 
viewed  as  being  composed  of  two  factors  or  variable  "group¬ 
ings,"  one  of  climate,  and  one  of  watershed  characteristics. 
The  results  of  the  factor  analysis  are  two  or  more  factors, 
and  an  examination  of  the  loadings  of  the  factors  will 
reveal if the two-factor assumption was correct. In hydrology,
factors might easily be formulated, although none of
the reviewed papers employed this method. Eiselstein (1967)
grouped the original independent variables into four
categories, but he did not initially state that he expected
a factor for each category. His first principal component
could have been termed a "rainfall" factor, but the remaining
components were not readily associated with the categories.
His purpose, that of principal component analyses,
was  to  derive  new,  truly  independent  variables  for  a  mul¬ 
tiple  linear  regression.  Components  are  independent,  and 
the  combined  effect  of  the  variables  within  each  component 
is  independent  of  the  effect  of  other  components.  This 
means  that  the  important  variables  within  a  component  are 
not  necessarily  from  the  same  category,  as  is  hopefully 
the  case  with  factor  analyses.  No  initial  assumptions 
about  the  outcome  are  made  with  component  analyses.  The 
data  is  analyzed  only  for  uncorrelated  components,  and 
the  model  differs  from  that  of  a  factor  analysis  in  the 
above  manner.  If  factors  result,  they  are  coincidental, 
but  are  desirable  because  they  give  information  about  the 
importance  of  groups  of  variables. 

Neither  a  factor  analysis  nor  a  component  analysis 
provides  information  about  the  importance  of  each  independ¬ 
ent  variable.  The  coefficients  on  the  variables  are  cor¬ 
relations  of  the  variables  with  the  factors  or  components, 
but  they  generally  are  relatively  large  in  magnitude  for 
all the variables. If one or more variables in a component
analysis have small correlations with all the components,
then those variables are not important to any of the components,
and  hence  to  the  total  problem.  Because  the  first  component 
is  found  in  a  manner  which  yields  high  correlations  with 
all  the  variables,  no  interpretations  can  be  made,  and 
the  components  are  useful  only  when  regression  of  uncor¬ 
related  variables  is  desired. 

Because a principal component analysis does not usually
provide information about the importance of each independent
variable, another multivariate method, "rotation of
the principal components," is in general use. The components
can be rotated so that each component has high coefficients
on certain variables and low coefficients on other
variables, allowing interpretations for the important variables,
and obtaining the "simple structure" of the components
(Matalas, 1967). Graphical rotation presentations
are  given  by  Fruchter  (1954)  and  Baggaley  (1964).  Both 
authors  use  two-dimensional  plots  which  show  the  measure¬ 
ments  of  the  variables  as  points,  and  rotated  components 
are simply lines drawn through clusters or "streaks" of
points  so  as  to  maximize  zero-loadings  on  the  components. 
However,  the  graphical  solutions  are  approximate,  and 
different  investigators  might  obtain  different  results 
with the same data. Kaiser (1958, 1959) derived an
analytic rotation which guarantees the same results for
different investigators. This method is known as "varimax
rotation," and has been utilized in the literature (Rice,
1967; Wallis, 1965; Wong, 1967; Eiselstein, 1967).

Rotation  of  the  variates  from  either  a  component  or 
factor  analysis  has  no  effect  on  the  amount  of  information 
retained  by  the  variates.  Only  the  interpretation  of  the 
loadings is affected (Cooley and Lohnes, 1962). Kaiser's
"normal" varimax rotation not only minimizes the "in-between"
loadings, but it also maintains an orthogonal or
perpendicular  reference  frame  (Wallis,  1965;  Kaiser,  1958). 
This  feature  is  desirable  if  a  multiple  regression  on  the 
rotated  components  is  to  be  performed,  because  perpendicu¬ 
larity  of  the  components  means  that  they  are  uncorrelated, 
and  the  assumptions  of  multiple  regression  are  not  violated. 
"Oblique,"  or  non-perpendicular  analytical  rotations  such 
as the "Quartimin," "Oblimin," or "Covarimin" have been
developed,  but  do  not  appear  satisfactory  (Cooley  and 
Lohnes,  1962). 

Multivariate  methods  are  advantageous  in  several 
aspects. Ordinary multiple regression provides good prediction
results, but gives no insight into the interrelationships
of the variables. Multivariate methods obviate
the effects of highly correlated "independent" variables,
while  ordinary  multiple  regression  analyses  do  not. 
Interpretations  of  the  results  of  multivariate  analyses 
allow  the  exclusion  of  unimportant  variables  and  the  rec¬ 
ognition  of  the  more  important  variables. 

Multivariate  methods  also  allow  the  reduction  of  the 
number  of  variates  for  multiple  regression.  Component 
analyses  produce  exactly  as  many  new  variates  as  there  are 
independent  variables.  However,  since  the  extraction  is 
done  on  a  "priority"  basis,  some  of  the  latter  variates 
may  reproduce  only  a  small  portion  of  the  information, 
and may be excluded. Wong (1967), Eiselstein (1967), and
Rice  (1967)  were  all  able  to  considerably  reduce  the  number 
of  variates  needed  to  reproduce  almost  all  of  the  informa¬ 
tion  present. 

Besides  allowing  the  interpretation  of  the  importance 
of  each  variable  to  the  original  data,  the  orthogonality 
of  the  new  variates  is  a  principal  achievement.  Because 
the  data  can  be  expressed  by  uncorrelated  variates,  then 
multiple  regression  assumptions  are  not  violated  if  the 
regression  is  performed  on  these  variates.  The  new  inde¬ 
pendent  variates  reproduce  the  original  data  and  are  truly 
uncorrelated. Because of this, multivariate methods indicate
the correlations among the independent variables
by producing uncorrelated variates.

Still  another  advantage  of  multivariate  methods  with 
correlated  variables  is  the  improvement  in  the  final  re¬ 
gression  equation  coefficients.  Snyder  (1962)  stated  that 
multivariate  methods  yield  "nice"  coefficients  which  in¬ 
dicate  the  relative  importance  of  each  independent  variable 
to  the  criterion.  Wallis  (1965)  demonstrates  that  princi¬ 
pal  component  regression  coefficients  tend  to  be  "stable" 
when  compared  to  ordinary  regression  coefficients.  In 
a discussion of Rice's paper, Anderson (1967) stated that
the  "coefficients  remained  very  distinctly  realistic  with 
regression  on  principal  components"  (p.  6).  This,  and  the 
other  advantages  of  multivariate  methods  seem  to  provide 
justification  for  their  use  in  hydrologic  studies. 

Investigators  of  the  many  multivariate  methods  agree 
that  the  best  statistical  system  of  analyzing  hydrologic 
data  would  start  with  a  principal  component  extraction, 
followed  by  a  varimax  rotation  for  interpretation  of  the 
variables,  and  a  multiple  regression  on  either  the  principal 
components or the rotated factors for the prediction equation
(Eiselstein, 1967; Wallis, 1965; Wong, 1967; Anderson,
1967). Baggaley (1964), however, states that good rotation
results are not likely unless more than 20 variables are
involved. Because 29 independent variables were available
for the analysis reported herein, multivariate methods
were indicated for the present study.

DEVELOPMENT OF METHODS

The  methods  presented  in  this  section  were  developed 
by  Kendall  (1957)  and  Kaiser  (1958),  and  most  of  their 
equations  and  derivations  are  repeated  herein.  Graphical 
interpretations  of  the  methods  are  included  in  the  Appendix. 
Some  of  the  steps  in  Kendall’s  and  Kaiser’s  derivations 
were  omitted  in  their  reports,  and  are  included  in  the 
present  development  because  of  their  importance. 


Model 

In all studies which involve regression as a final
analysis, a model for the regression must be assumed. The
variables may be powered, multiplied, divided, or added.
In general, multiple regression analyses provide the coefficients
for linearly additive independent variables. This
equation usually has the form

$$Y = B_0 X_0 + B_1 X_1 + \cdots + B_p X_p \tag{1}$$

where $Y$ is the dependent variable or criterion, $X_i$ is
the i-th independent variable, and $B_i$ is the desired regression
coefficient. In general, $X_0$ is unity, and the equation
is that of a "hyperplane" in (p-1) dimensions.

Hydrologic variables are generally multiplicative in
nature (Wallis, 1965), and plotting one variable against
another on logarithmic paper usually approximates a linear
relationship. For this reason, hydrologists often assume
that the dependent variable will be related to the p independent
variables in the form

$$Y = B_0 X_1^{B_1} X_2^{B_2} X_3^{B_3} \cdots X_p^{B_p} \tag{2}$$

Upon taking logarithms of this equation, a linear equation
having the form

$$\log Y = \log B_0 + B_1 \log X_1 + B_2 \log X_2 + \cdots + B_p \log X_p \tag{3}$$

is obtained. If the logarithms of the variables are used
as variables, then a linear multiple regression may be
performed that will provide the coefficients for Equation
(3). Equation (3) is the model used herein to represent
the equations for peak discharge rate and total runoff
volume. The logarithm to the base 10 of the measured
variable is represented hereafter by $X_k$.
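
The step from Equation (2) to Equation (3) can be illustrated for a single independent variable. The Python sketch below is a minimal example, not the computation used in this study: it fits Y = B0 X^B1 by ordinary least squares on base-10 logarithms. The function name `fit_power_law` and the area and peak-discharge values are hypothetical.

```python
import math


def fit_power_law(x, y):
    """Fit Y = B0 * X**B1 by least squares on base-10 logarithms."""
    log_x = [math.log10(v) for v in x]
    log_y = [math.log10(v) for v in y]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    b1 = sxy / sxx                   # slope of the line on logarithmic paper
    log_b0 = mean_y - b1 * mean_x    # intercept is Log B0
    return 10.0 ** log_b0, b1


# Hypothetical drainage areas and peak discharges generated from Q = 50 * A**0.7
area = [0.5, 1.0, 2.0, 4.0, 8.0]
peak = [50.0 * a ** 0.7 for a in area]
b0, b1 = fit_power_law(area, peak)
```

Because the sample data follow the power law exactly, the fit recovers the generating values of B0 and B1 to within rounding. With several independent variables the same transformation applies, and the coefficients come from a multiple regression on the logarithms.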


Principal Component Analysis Theory

The attempt in a principal component analysis is to
find a set of linear equations which reproduce the information
present in a set of measurements of several independent
variables. Hopefully, the number of linear
equations required will be less than the number of variables.
These new variables are referred to as "principal components,"
or simply as "components" in the literature. The linear form
of the components, for the i-th value of the j-th component, is

$$V_{ji} = \sum_{k=1}^{p} l_{kj}\, x_{ki} \tag{4}$$

where $x_{ki}$ is the i-th measurement of the standardized form
of the variable $X_k$, or

$$x_{ki} = \frac{X_{ki} - \bar{X}_k}{s_k} \tag{5}$$

and $l_{kj}$ is an undetermined coefficient for the k-th variable
and the j-th component. The other terms, $\bar{X}_k$ and $s_k$,
are the mean and standard deviation, respectively, of the
k-th variable. The purpose of standardization is to give
the standardized variables a mean of zero, and a variance
equal to unity. For n observations of the standardized
variable $x_k$, the mean is

$$\bar{x}_k = \frac{1}{n} \sum_{i=1}^{n} x_{ki} = 0 \tag{6}$$

and the variance is

$$\frac{1}{n} \sum_{i=1}^{n} x_{ki}^2 = 1 \tag{7}$$

Standardization does not change the importance of the
variables, and the terms are now dimensionless because
the standard deviation has the dimensions of the variable.
Therefore, if the variables are standardized before the
analyses are begun, the effects of the different dimensions
of each variable are not present, and better interpretations
are possible.
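
The effect of standardization on dimensions can be checked numerically. The Python sketch below is an illustration with hypothetical watershed areas, and it assumes the standard deviation is computed by dividing by n, which gives a variance of exactly unity. It standardizes the same areas expressed in square miles and in acres.

```python
import math


def standardize(values):
    """Return (X - mean) / s for each value, with s computed by dividing by n."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / s for v in values]


area_sq_mi = [0.8, 1.5, 2.1, 3.4, 5.2]         # hypothetical watershed areas
area_acres = [640.0 * a for a in area_sq_mi]   # the same areas in other units

x_mi = standardize(area_sq_mi)
x_ac = standardize(area_acres)

mean_std = sum(x_mi) / len(x_mi)                    # mean of standardized values
var_std = sum(v ** 2 for v in x_mi) / len(x_mi)     # variance of standardized values
```

The two standardized series agree to within rounding, so the choice of units has no influence on the analysis that follows.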

To solve for the $l_{kj}$ values in Equation (4), two
conditions must be satisfied. First, it is desired that
the components be statistically uncorrelated (truly independent)
so that a regression of the components may be
performed without violating the independence assumption.
This means that the "correlation coefficient" for any two
components must equal zero. The correlation coefficient is
a statistical measure of the degree of linear association
between two variables. The value of the coefficient ranges
between plus one and minus one, depending on the increase
or decrease of one variable, respectively, as the other
variable is increased. A correlation of plus or minus one
would mean that the variables were perfectly related, and
a line plotted on a two-dimensional graph would intersect
every point. For any two components to be uncorrelated,

$$r_{bc} = 0 \tag{8}$$

must be satisfied, where $r$ is the correlation coefficient,
and b and c are the subscripts of the respective components.

The second condition for solution for the $l_{kj}$ values
in Equation (4) is that the first component must reproduce
a maximum amount of the information in the original measurements
of the variables. A statistical measure of the
information contained by a component is the variance of the
component. Because the variables are standardized, they
each have a variance of unity, and the total variance is
equal to p, because there are p independent variables. If
the first few components are to reproduce the information
in the measurements of the variables, then the sum of the
variances of the components must also equal p. Therefore,
the component which has a maximum variance is desired. To
determine the component, it is necessary to write the
equation for the variance of any component in terms of the
unknown $l_{kj}$ values, and then equate the partial differentials
of this equation, with respect to $l_{kj}$, to zero.
This procedure gives the $l_{kj}$ values which maximize or
minimize the variance of the first component.

Kendall (1957, p. 14) takes the partial differentials
of the equation for the variance of a component and finds
the solution which minimizes or maximizes the variance.
If $r_{jk}$ is the correlation of any two variables defined by

$$r_{jk} = \frac{1}{n} \sum_{i=1}^{n} x_{ji}\, x_{ki} \tag{9}$$

and if $L$ is an "undetermined multiplier" (see Appendix A
for a graphical derivation), then the criterion for the
extremum is the determinant (Kendall, p. 15)

$$\begin{vmatrix}
(1-L) & r_{12} & r_{13} & \cdots & r_{1p} \\
r_{21} & (1-L) & r_{23} & \cdots & r_{2p} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
r_{p1} & r_{p2} & r_{p3} & \cdots & (1-L)
\end{vmatrix} = 0 \tag{10}$$


If $I$ is defined as an identity matrix, then the determinant
can be written

$$|R - LI| = 0 \tag{11}$$

in matrix notation. The matrix of correlation coefficients
of  the  standardized  independent  variables,  R,  is  easily 
computed  by  applying  Equation  (9)  to  all  combinations  of 
variables,  and  L  is  the  only  unknown. 
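
Applying Equation (9) to all combinations of variables can be sketched as follows. The Python example is a minimal illustration with hypothetical observations of three variables on five storms. The function names are assumptions, and the standard deviation is again taken with a divisor of n so that the self-correlations on the principal diagonal equal unity.

```python
import math


def standardize(values):
    """Standardize one variable to zero mean and unit variance (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / s for v in values]


def correlation_matrix(variables):
    """variables: list of p lists, each holding n observations.

    Returns the p-by-p matrix R with r_jk = (1/n) * sum_i x_ji * x_ki,
    where x denotes the standardized observations.
    """
    z = [standardize(v) for v in variables]
    n = len(z[0])
    p = len(z)
    return [[sum(a * b for a, b in zip(z[j], z[k])) / n for k in range(p)]
            for j in range(p)]


# Hypothetical observations of three variables on five storms
intensity = [0.4, 1.1, 0.7, 1.9, 1.3]
duration = [6.0, 2.5, 4.0, 1.0, 2.0]
soil_temp = [48.0, 61.0, 55.0, 66.0, 59.0]
R = correlation_matrix([intensity, duration, soil_temp])
```

The resulting matrix is symmetric with unities on the principal diagonal, which is the form assumed in Equation (10).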

Equation (11) is referred to as the "characteristic"
equation of the correlation matrix. Several solutions to
this equation are available (Cooley and Lohnes, 1962; Ralston,
1960) which yield the L's and a set of $l_{kj}$ values
for each L. Also, there are generally p values of L which
satisfy Equation (11). These values are called "eigenvalues,"
"latent roots," "characteristic roots," "proper
values," or "characteristic values" by various authors. The
term "eigenvalues" is used herein.

The set of $l_{kj}$ values corresponding to any L is known
as the "eigenvector" for that eigenvalue. If each $l_{kj}$
value in an eigenvector is divided by the square root of
the sum of squared $l_{kj}$ values, then a "normalized eigenvector"
results. Some authors (Kendall, 1957) call the
normalized eigenvectors "principal components," and this
notation is used herein. Other authors (Eiselstein, 1967;
Harman, 1967) multiply the normalized eigenvector "loadings"
by the square root of the respective eigenvalue and call
this new vector the principal component, or "principal
factor." In the present analysis, a principal component
is identical to a normalized eigenvector, and a factor,
whenever cited, is meant to be defined as

$$A_j = \sqrt{L_j}\; V_j \tag{12}$$

where the normalized eigenvector, $V_j$, is defined by Equation
(4), and $L_j$ is the eigenvalue corresponding to the j-th
eigenvector. The $l_{kj}$ values in Equation (4) are coefficients
of normalized eigenvectors.
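
As a check on these definitions, the two-variable case can be worked by hand. The example below is an added illustration, assuming p = 2 standardized variables with a hypothetical correlation r = 0.6; it is not taken from Kendall or Kaiser.

```latex
\begin{aligned}
|R - LI| &= \begin{vmatrix} 1-L & r \\ r & 1-L \end{vmatrix}
          = (1-L)^2 - r^2 = 0
          \quad\Longrightarrow\quad L_1 = 1 + r,\; L_2 = 1 - r, \\
V_1 &= \tfrac{1}{\sqrt{2}}\,(1,\ 1), \qquad
V_2 = \tfrac{1}{\sqrt{2}}\,(1,\ -1), \\
A_1 &= \sqrt{L_1}\,V_1 = \sqrt{\tfrac{1+r}{2}}\,(1,\ 1), \qquad
A_2 = \sqrt{L_2}\,V_2 = \sqrt{\tfrac{1-r}{2}}\,(1,\ -1), \\
r = 0.6:&\quad L_1 = 1.6,\quad L_2 = 0.4,\quad L_1 + L_2 = 2 = p, \\
&\quad A_1 \approx (0.894,\ 0.894), \qquad A_2 \approx (0.447,\ -0.447).
\end{aligned}
```

Each normalized eigenvector has a sum of squared coefficients equal to unity, while the squared factor loadings of each variable (0.8 and 0.2) sum to unity across the two factors.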




Kendall (1957, p. 15) shows that the variance of a
component is

$$\text{Variance}(V_j) = \frac{1}{n} \sum_{i=1}^{n} \left( \sum_{k=1}^{p} l_{kj}\, x_{ki} \right)^2 \tag{13}$$

which  is  also  the  component’s  eigenvalue.  This  means  that 
the  component  having  the  largest  eigenvalue  reproduces  a 
maximum  amount  of  the  information  contained  in  the  inde¬ 
pendent  variable  correlations.  The  component  having  the 
second  largest  eigenvalue  reproduces  the  next  largest 
amount  of  the  information,  etc.,  for  all  p  components. 

The sum of the eigenvalues for a "symmetric" matrix,
such  as  the  correlation  matrix  used  herein,  equals  the  sum 
of  the  principal  diagonal  entries  (Hotelling,  1933,  p.  429). 
Because  a  principal  component  analysis  uses  the  self-corre¬ 
lations  of  the  variables  in  this  diagonal,  then  this  sum 
is  equal  to  p,  because  p  independent  variables  are  used. 
Also,  as  shown  earlier,  the  total  variance  in  the  stand¬ 
ardized  independent  variables  is  equal  to  p,  and  the  eigen¬ 
values  are  the  variances  of  each  component.  The  total 
variance  of  the  components  is  equal  to  the  total  variance 
present,  and  the  components  therefore  contain  all  the 
information  in  the  correlation  matrix,  which  in  turn  con¬ 
tained  the  information  in  the  measurements  of  the  data. 

Also, the variance of each component is the portion of the
total information reproduced by the component. The per
cent of the total variance "accounted" for by any component
is therefore given by

$$\%\ \text{Variance} = \frac{L_j}{p}\,(100) \tag{14}$$

Earlier,  it  was  stated  that  as  few  components  as 
possible  were  desired  to  account  for  the  total  variance 
in  the  independent  variables.  Because  each  successive 
component  accounts  for  a  maximum  amount  of  the  remaining 
variance  in  the  original  variables,  then  some  of  the  later 
components  can  be  expected  to  have  small  eigenvalues. 
Originally,  there  are  p  components  accounting  for  100  per 
cent  of  the  variance.  If,  for  example,  30  independent 
variables  were  under  investigation,  then  30  components 
would  be  computed.  However,  if  10  of  the  components  ac¬ 
counted  for  90  per  cent  of  the  variance,  then  20  components 
could  be  dropped  if  the  10  per  cent  loss  in  information 
was  justified  by  the  reduction  in  variables  for  regression. 
The  regression  equation  would  contain  the  30  original 
variables,  but  would  be  performed  on  the  10  uncorrelated 
components  and  the  dependent  variable,  and  not  on  the 
30  independent  variables.  The  resulting  equation  should 
be  a  better  prediction  equation  than  one  obtained  from 
ordinary multiple linear regression with all 30 variables,
because the regression assumption of independence would
not be violated.
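
The procedure in the preceding paragraph can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the program used in this study: the data are synthetic, the function name `principal_component_regression` and the `keep_fraction` parameter are hypothetical, and the eigenvalues are found with `numpy.linalg.eigh` rather than by the methods cited above.

```python
import numpy as np


def principal_component_regression(X, y, keep_fraction=0.90):
    """Regress y on the leading principal components of X.

    X: (n, p) array of independent variables; y: (n,) dependent variable.
    keep_fraction: share of the total variance the retained components
        must reach (0.90 mirrors the illustration in the text).
    Returns (coefficients on the standardized variables, intercept,
             eigenvalues in descending order, number of components kept).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    Z = (X - X.mean(axis=0)) / X.std(axis=0)    # standardize (divisor n)
    R = Z.T @ Z / n                             # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)        # returned in ascending order
    order = np.argsort(eigvals)[::-1]           # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    cumulative = np.cumsum(eigvals) / p         # cumulative share of variance
    m = min(int(np.searchsorted(cumulative, keep_fraction)) + 1, p)
    scores = Z @ eigvecs[:, :m]                 # uncorrelated variates
    # The scores are orthogonal, so each regression coefficient is a ratio.
    gamma = scores.T @ (y - y.mean()) / (n * eigvals[:m])
    beta = eigvecs[:, :m] @ gamma               # expressed on all p variables
    return beta, y.mean(), eigvals, m


# Synthetic example: ten correlated predictors built from three signals.
rng = np.random.default_rng(0)
base = rng.normal(size=(40, 3))
X = base @ rng.normal(size=(3, 10)) + 0.1 * rng.normal(size=(40, 10))
y = X[:, 0] - 0.5 * X[:, 4] + 0.1 * rng.normal(size=40)
beta, intercept, eigvals, m = principal_component_regression(X, y)
```

The returned equation contains all ten standardized variables, although the regression itself is performed on only the `m` retained components.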

No  analytical  method  of  determining  which  components 
are important could be found in the literature. Only subjective
observations of the values have been used. Kendall
(1957) uses only those components whose eigenvalues are
significantly larger than others. In one example, he performed
a principal component analysis on five variables and
arrived at five eigenvalues: 3.2470, 1.2753, 0.3859,
0.0700, and 0.0218, totaling 5.0000. Because of the
large  difference  between  the  third  and  fourth,  he  used  only 
the  first  three  components  in  his  regression  equation  for 
beer  consumption,  accounting  for  98  per  cent  of  the  orig¬ 
inal  variance.  Cooley  and  Lohnes  (1962,  p.  160)  state  that 
if  unities  were  used  in  the  principal  diagonal  of  the  cor¬ 
relation  matrix,  then  only  those  components  whose  eigen¬ 
values  are  greater  than  unity  should  be  used  in  future 
analyses.  Wallis  (1965)  recommends  the  use  of  as  many  new 
variates  as  are  needed  to  account  for  as  high  as  99.5  per 
cent of the variance, thereby retaining most of the original
variance. Other authors are aware of this problem,
but are not specific in their methods of solution. Anderson
(1967) agrees with Kendall's method of observing a
"large change" in the eigenvalues. Hotelling (1933, p. 421)
suggests that the components "whose contributions to the
total variance are small" be neglected. Eiselstein (1967)
neglected  those  components  (dimensions)  whose  contributions 
to  the  total  variance  were  less  than  one  per  cent.  Because 
of  the  relatively  small  use  of  principal  component  analyses 
to  date,  no  mathematical  method  for  determining  which 
variates  to  use  has  yet  been  established.  According  to 
the  literature,  the  investigator  must  analyze  his  own 
results  for  the  solution  to  this  problem. 
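
The retention rules cited above can be compared on Kendall's five eigenvalues. The Python sketch below applies Equation (14) and then counts the components each rule would keep. The variable and function names are hypothetical, and the rules are coded as they are paraphrased in this section.

```python
eigenvalues = [3.2470, 1.2753, 0.3859, 0.0700, 0.0218]   # Kendall's example
p = len(eigenvalues)

percent = [100.0 * L / p for L in eigenvalues]            # Equation (14)
cumulative = [sum(percent[:i + 1]) for i in range(p)]


def count_until(target_percent):
    """Number of leading components needed to reach a cumulative percentage."""
    for i, c in enumerate(cumulative):
        if c >= target_percent:
            return i + 1
    return p


kept_unity = sum(1 for L in eigenvalues if L > 1.0)       # Cooley and Lohnes
kept_one_percent = sum(1 for q in percent if q >= 1.0)    # Eiselstein
kept_995 = count_until(99.5)                              # Wallis
```

For these eigenvalues the unity rule keeps two components, while the one per cent rule and the 99.5 per cent rule each keep four. Kendall himself kept three, which account for about 98 per cent of the variance. The disagreement illustrates why the choice is left to the investigator.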

In  the  above  paragraphs,  it  has  only  been  stated  that 
the  components  derived  are  uncorrelated,  or  that  Equation 
(8) applies. Kendall (1957, p. 16) proves that the correlation
of any two components is zero, and that all the
components are perpendicular, forming a "p-dimensional
reference frame." For this reason, principal components
are  sometimes  referred  to  as  "principal  axes." 

The  principal  components  provide  a  means  of  deriving 
uncorrelated  variates  for  multiple  regression.  Graph¬ 
ically,  they  are  reference  axes  drawn  through  the  meas¬ 
urements  of  the  independent  variables  so  that  a  maximum 
amount  of  the  information  contained  by  the  measurements 
is explained by the components. However, the components
provide no information about the relative importance of the
original variables; obtaining it is the purpose of the
next section, which presents varimax rotation theory.


Varimax  Rotation  Theory 

Varimax rotation of normalized factors from a factor
analysis was originated by Kaiser (1958). Kaiser (1959)
developed a computer program to perform the rotation.
A graphical explanation of factor rotation was not presented
by Kaiser, and is therefore presented in Appendix B.

The set of axes through the data points which gives
the best interpretation of the contribution of each independent
variable to the information is obtained with
varimax rotation by rotating the principal axes
discussed previously. If the $b_{kj}$ values are the desired
coefficients for the rotated principal factors, and if
the $a_{kj}$ values are the coefficients from Equation (12),
then the variance of the k-th rotated factor is given by
Kaiser (1959) as

Variance ($A_k$)                    (15)

which he calls the "varimax criterion" for maximum interpretation
(see Appendix B for a graphical interpretation).

Because  the  a-^j  values  are  known,  and  because  the  b^j 
values  are  geometric  functions  of  the  a,  .  values  and  the 
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angle  of  rotation  of  the  principal  axes,  0,  which  maximizes 
the  criterion,  then  the  angle  is  the  only  unknown,  and 


bk-l 

=  akl  cos  0  +  ak2 

sin 

0 

( 16a ) 

\2 

=  ak2  cos  6  -  akl 

sin 

0 

(16b) 

for the first and second factors (Kaiser, 1959). The
desired angle in the plane of these factors can be obtained
by substituting b_kj values from Equations (16) into
Equation (15), differentiating with respect to θ, and
solving for the angle which makes the differential equal
to zero. After the angle is found, the values of b_k1 and
b_k2 can be computed from Equations (16), giving the loadings
of the rotated factors. If these single-plane rotations
are made for all combinations of two factors, then the
complete set of b_kj values can be found. Substitution
of these values into Equation (15) will give a value of
the criterion which should be a maximum, and the b_kj values
are the coefficients relating the rotated factors to the
independent variables. The conditions for a maximum of
Equation (15) are found by Kaiser, who takes the second
derivative of Equation (15) with respect to θ, and solves
for the limits on θ which make the derivative negative.
If θ for any single-plane rotation is not within these
limits, then the angle is equated to the nearest limit,
and a second rotation is made from this point. This process
will converge to the solution for Equation (15) which
is a maximum (Kaiser, 1959), giving the desired b_kj values.
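The single-plane rotations of Equations (16a) and (16b) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the program used in this study: it works on raw (not row-normalized) loadings, whereas Kaiser's normal varimax first scales each row by its communality, and it finds the angle for each pair of factors from a closed-form arctangent expression rather than from the second-derivative limits described above. The function names are hypothetical.

```python
import math

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    p = len(loadings)
    total = 0.0
    for j in range(len(loadings[0])):
        sq = [row[j] ** 2 for row in loadings]
        mean_sq = sum(sq) / p
        total += sum(v * v for v in sq) / p - mean_sq ** 2
    return total

def rotate_pair(loadings, i, j):
    """Rotate factors i and j in their own plane; return the angle used."""
    p = len(loadings)
    u = [row[i] ** 2 - row[j] ** 2 for row in loadings]
    v = [2.0 * row[i] * row[j] for row in loadings]
    a, b = sum(u), sum(v)
    c = sum(uk * uk - vk * vk for uk, vk in zip(u, v))
    d = 2.0 * sum(uk * vk for uk, vk in zip(u, v))
    # Angle that maximizes the raw criterion for this pair of columns.
    theta = 0.25 * math.atan2(d - 2.0 * a * b / p, c - (a * a - b * b) / p)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    for row in loadings:
        a_k1, a_k2 = row[i], row[j]
        row[i] = a_k1 * cos_t + a_k2 * sin_t   # Equation (16a)
        row[j] = a_k2 * cos_t - a_k1 * sin_t   # Equation (16b)
    return theta

def varimax(loadings, max_sweeps=100, tol=1e-8):
    """Sweep over all pairs of factors until the angles become negligible."""
    b = [[float(value) for value in row] for row in loadings]
    s = len(b[0])
    for _ in range(max_sweeps):
        largest = 0.0
        for i in range(s - 1):
            for j in range(i + 1, s):
                largest = max(largest, abs(rotate_pair(b, i, j)))
        if largest < tol:
            break
    return b
```

Because each plane rotation is orthogonal, the sum of squared loadings in every row is unchanged, and the criterion cannot decrease from one rotation to the next.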

Because the perpendicularity of the factors is maintained
by Kaiser's normal varimax rotation, the final
factors are uncorrelated and provide independent variates
for regression. The principal components are also
perpendicular, but give no information about the variables because
the coefficients are not derived for this purpose. If
certain variables are not highly correlated with any of
the rotated factors, then they do not contribute to the
information contained by the factors, and they may be
deemed unimportant to the information. Also, each rotated
factor accounts for a percentage of the total information,
and if some factors are not important, then any variables
correlated only with these factors are also unimportant.
Certain independent variables may therefore be regarded
as unimportant and excluded from the regression analysis
of the rotated factors.

Multiple  Regression  of  Principal  Components  and  Rotated 
Factors 

A principal component analysis could be performed
with no rotation for interpretation of the variables, if
the independent variables used were all known to be
important. A multiple regression of the principal components
would give an equation for the dependent variable
in terms of all the independent variables with no regression
assumption violations. If this was the only purpose
of the analysis, then one of two methods could be used
to obtain the regression coefficients of the independent
variables.

One method of obtaining the coefficients for the
principal components is to solve the equations of the
components using all the measurements of the original
independent variables, giving n sets of values for the
s components, one set for each measurement of the dependent
variable. The log-transformed data from one measurement
of all the variables is substituted into Equation (4)
for all s components, giving one set of the new independent
variates for the respective measurement of the dependent
variable. This is repeated for all observations,
and an ordinary linear multiple regression of the dependent
variable and the values for the components is performed.
This gives the regression constant and coefficients
for the equation of the i-th observation

    y_i = F_0 + \sum_{j=1}^{s} F_j V_{ji}                  (17)

where the F_j values are the regression coefficients of the
components, and y is the standardized dependent variable.


Substituting Equation (4) for V_ji gives the regression
equation in terms of the original standardized independent
variables

    y_i = F_0 + \sum_{j=1}^{s} F_j \sum_{k=1}^{p} l_{kj} x_{ki}          (18)

If y and x_k are the standardized form of the logarithms
of the original variables, then

    x_{ki} = \frac{\mathrm{Log}\,X_{ki} - \overline{\mathrm{Log}\,X_k}}{s_{\mathrm{Log}\,X_k}}          (19)

where X_ki is the i-th measurement of the variable X_k.
Substituting this and a similar equation for y into
Equation (18) gives the regression equation in terms of the
measured variables


    \mathrm{Log}\,Y = \sum_{k=1}^{p} B_k \mathrm{Log}\,X_k + C          (20)

Equation (20) is valid only if

    B_k = s_{\mathrm{Log}\,Y} \sum_{j=1}^{s} \frac{F_j l_{kj}}{s_{\mathrm{Log}\,X_k}}

and if

    C = s_{\mathrm{Log}\,Y} F_0 + \overline{\mathrm{Log}\,Y}
        - s_{\mathrm{Log}\,Y} \sum_{j=1}^{s} \sum_{k=1}^{p}
          \frac{F_j l_{kj}}{s_{\mathrm{Log}\,X_k}} \overline{\mathrm{Log}\,X_k}

Equation (20) reduces to the form desired, given by
Equation (2) as

    Y = B_0 X_1^{B_1} X_2^{B_2} X_3^{B_3} \cdots          (2)

if

    B_0 = \mathrm{Antilog}(C)
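The conversion from the standardized regression of Equation (18) to the coefficients of Equations (20) and (2) is simple arithmetic, and a short Python sketch may make the bookkeeping easier to follow. The function and argument names are hypothetical, base-10 logarithms are assumed as in the analysis of the data, and the numbers in any call are for illustration only.

```python
def measured_unit_coefficients(f0, f, l, mean_log_x, s_log_x, mean_log_y, s_log_y):
    """Return (B, C, B0) of Equations (20) and (2) from standardized results.

    f0, f       : regression constant F_0 and component coefficients F_j
    l           : l[k][j], element k of normalized eigenvector j
    mean_*, s_* : means and standard deviations of the base-10 logarithms
    """
    p, s = len(l), len(f)
    # Standardized coefficient of x_k: sum over components of F_j * l_kj.
    beta = [sum(f[j] * l[k][j] for j in range(s)) for k in range(p)]
    b = [s_log_y * beta[k] / s_log_x[k] for k in range(p)]
    c = (s_log_y * f0 + mean_log_y
         - s_log_y * sum(beta[k] * mean_log_x[k] / s_log_x[k] for k in range(p)))
    return b, c, 10.0 ** c

def predict_y(b0, b, x_values):
    """Equation (2): Y = B0 * X1**B1 * X2**B2 * ..."""
    y = b0
    for x, exponent in zip(x_values, b):
        y *= x ** exponent
    return y
```

A prediction made with `predict_y` should agree with one made by standardizing the logarithms, applying Equation (18), and transforming the result back to Y.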

Because the original data must be used twice for
the computation of the coefficients in Equation (2),
once for the correlation matrix computation and once
in the regression analysis, a better method of obtaining
the coefficients with fewer computations may be used.
Kendall (1957) shows that the regression coefficients for
the principal components (normalized eigenvectors) can
be computed from the eigenvalues by using the equation

    F_j = \frac{\sum_{k=1}^{p} l_{kj} r_{ky}}{L_j}          (21)

which can be derived using multiple regression theory.
The values of r_ky are the correlation coefficients of the
k-th standardized independent variable with the dependent
variable y, and are known from the correlation analysis.
The other terms are defined earlier, and the F_j values
can be computed without returning to the original data.
Once the F_j values are found, Equation (2) gives the
desired relationship.
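Equation (21) can be checked by hand for the smallest possible case, two standardized independent variables, where the eigenvalues and normalized eigenvectors of the correlation matrix are known in closed form. The Python sketch below is an illustration only; the function name and the correlations used to exercise it are hypothetical, and it assumes the absolute value of r12 is less than one.

```python
import math

def component_coefficients(r12, r1y, r2y):
    """Apply Equation (21) to two standardized variables.

    r12 is the correlation between x_1 and x_2; r1y and r2y are their
    correlations with the dependent variable y.
    """
    c = 1.0 / math.sqrt(2.0)
    eigenvalues = [1.0 + r12, 1.0 - r12]      # L_1, L_2
    l = [[c, c],                              # l[k][j]: row k = variable,
         [c, -c]]                             # column j = component
    r_y = [r1y, r2y]
    # Equation (21): F_j = sum_k l_kj r_ky / L_j
    f = [sum(l[k][j] * r_y[k] for k in range(2)) / eigenvalues[j]
         for j in range(2)]
    # Coefficients of x_1 and x_2 implied by the component regression.
    beta = [sum(f[j] * l[k][j] for j in range(2)) for k in range(2)]
    return f, beta
```

When both components are retained, the implied coefficients equal those of an ordinary regression of y on the two standardized variables, (r1y - r12 r2y)/(1 - r12^2) and (r2y - r12 r1y)/(1 - r12^2), which confirms that no return to the original data is needed.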

A simplified method for multiple regression of rotated
factors could not be found in the literature, but
a method exactly the same as the method indicated by
Equation (17) for principal component regression may be
employed. Because the rotated factors are given by the
equation (see Appendix B)

    A'_{ji} = \sum_{k=1}^{p} c_{kj} x_{ki}          (22)

where the prime indicates a rotated factor, and the c_kj
values are the factor loadings, then Equations (17) through
(20) may be applied to the rotated factors, giving

    y_i = G_0 + \sum_{j=1}^{s} G_j A'_{ji}          (23a)

    B_k = s_{\mathrm{Log}\,Y} \sum_{j=1}^{s} \frac{G_j c_{kj}}{s_{\mathrm{Log}\,X_k}}          (23b)

    C = s_{\mathrm{Log}\,Y} G_0 + \overline{\mathrm{Log}\,Y}
        - s_{\mathrm{Log}\,Y} \sum_{j=1}^{s} \sum_{k=1}^{p}
          \frac{G_j c_{kj}}{s_{\mathrm{Log}\,X_k}} \overline{\mathrm{Log}\,X_k}

where the G_j values are from an ordinary linear multiple
regression of the rotated factors and dependent variable,
using only those independent variables deemed important.
Unlike the component regression, an equation for the
dependent variable in terms of the important independent
variables is obtained with Equation (23a). This equation,
or the one obtained from a principal component regression,
should give reasonable predictions of the dependent
variable, and should exhibit sensible signs and magnitudes
for the coefficients of the variables.

Summary 

The methods reviewed herein begin with the computations
of the correlations of all the standardized variables,
dependent and independent. This is followed by
the principal component analysis of the correlations among
the independent variables. The components are then
converted to factors which are rotated, allowing the
interpretation of the importance of the independent variables
to the information contained in the original measurements.
If the measurements of all the independent variables may
be obtained, then the principal component regression
equation provides the best prediction. Otherwise, an adequate
regression equation from the rotated factors and important
independent variables may be used.


Chapter IV
ANALYSIS OF DATA


The procedures developed in Chapter III were applied
to the watershed data in the order discussed in that
chapter. The methods used and results obtained in recording
the data, selecting the variables, computing the values of
the variables, and analyzing the values are presented in
this chapter.

DATA  RECORDED 

The measurements of 29 independent variables and two
dependent variables for 50 runoff events were obtained
from continuous data taken from five central and eastern
Montana watersheds. The watersheds were selected and
instrumented in 19^3 (Williams, 1965). Residents living
on or near the watersheds were consulted, and a system
for instrumentation of the watersheds was established by
those residents who agreed to service and maintain the
instruments. Table I gives the watershed names, locations,
sizes, and instrumentation numbers. The weather stations
consisted of instruments which measured and recorded, at
half-hour intervals, the following information: soil
moisture (per cent on a dry weight basis) and temperature
(deg F) at 3, 9, and 18 inches below ground surface; air
temperature (deg F) at 4 and 10 feet above the surface;
and wind speed (mph) and direction at 10 feet above the
surface. United States Geological Survey (USGS)
Quadrangle maps and some field slope measurements provided
the measurements of the watershed topographic variables.
The U.S. Soil Conservation Service (SCS) performed surveys
and prepared maps showing the types and proportions of
the soils on the watersheds. Measurements of infiltration
rates for most of the soils were made by project
personnel during the summer of 1966. The USGS obtained
cross-section and velocity measurements near the water-stage
recorders for some of the runoff events, yielding
data for the construction of stage-discharge curves.
Aerial photographs of the watersheds provided information
about the land use and snow coverage.

Table I.  Instruments Located on Watersheds

Watershed                  Lone Man   Bacon       Hump         Duck       East Fork
                           Coulee     Creek       Creek        Creek      Duck Creek

County                     Pondera    Wheatland   Sweetgrass   McCone &   Prairie
                                                               Prairie
Area (sq mi)               14.10      17.97       7.61         53.79      13.67
Runoff Events              20         4           3            10         13
Non-recording Rain Gages   2          3           1            4          4
Recording Rain Gages       2          2           2            3          2
Snow Courses               3          3           3            3          3
Weather Stations           2          2           1            3          3
Wind Stations*             1          1           1            1          1
Water-stage Recorders      1          1           1            2          1

*Some weather stations did not have instruments for measuring
wind speed and direction.

(The instrument stations for East Fork Duck Creek were the
same as for Duck Creek.)

Data from the continuous water-stage recorders were
reduced to hydrographs relating discharge rate to time,
using the stage-discharge relationships for each recorder.
These hydrographs provided the criterion for selection of
the runoff events to be used. Whenever the peak discharge
rate was larger than 10 cfs, a starting and an ending time
of discharge were established, giving the runoff hydrograph
for that event. Generally, there was no base flow before
the event, and the hydrograph was terminated when the
discharge rate reduced to zero flow. This procedure
established the 50 peak discharge rates, and the measurements
of the second dependent variable, total runoff, were
determined from the area under each hydrograph.
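The event selection and the area under the hydrograph can be illustrated with a short Python sketch. It is not the reduction program used for the project: the trapezoidal rule, the function name, and the choice of acre-feet as the unit of runoff volume are assumptions made here for illustration, since the text does not state how the area was measured or in what units the volume was expressed.

```python
def runoff_event(times_hr, discharge_cfs, threshold_cfs=10.0):
    """Return (peak discharge in cfs, runoff volume in acre-feet), or None.

    The event is kept only when the peak exceeds the threshold. The volume
    is the area under the hydrograph by the trapezoidal rule.
    """
    peak = max(discharge_cfs)
    if peak <= threshold_cfs:
        return None
    volume_cfs_hr = sum(
        0.5 * (discharge_cfs[i] + discharge_cfs[i + 1])
        * (times_hr[i + 1] - times_hr[i])
        for i in range(len(times_hr) - 1)
    )
    # 1 cfs-hour = 3600 cubic feet; 1 acre-foot = 43,560 cubic feet.
    volume_acre_ft = volume_cfs_hr * 3600.0 / 43560.0
    return peak, volume_acre_ft
```

For a triangular hydrograph rising from zero to 20 cfs in one hour and falling back to zero in the next, the area is 20 cfs-hours, or 72,000 cubic feet.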

SELECTION OF INDEPENDENT VARIABLES

Measurements of 29 independent variables for each
runoff event were computed. The variables are presented
in Table II, and this is followed by the reasons for the
selection of these particular variables. Detailed
descriptions of the variables are listed in Appendix C.

Table II.  Independent Variables Studied

Var.   Var.    Watershed and Storm Variable Definitions
No.    Name    & Units of Measurement

  1    A       Area (sq mi)
  2    SHP     Shape (dimensionless)
  3    AZ      Azimuth (deg)
  4    ELEV    Elevation (ft)
  5    GNDS    Ground slope (ft/ft)
  6    GNDL    Overland distance of flow (mi)
  7    FREQ    Stream frequency (1/sq mi)
  8    L       Main channel meander length (mi)
  9    S       Main channel straight-line slope (ft/ft)
 10    USE     Land use ratio (dimensionless)
 11    INFR    Soil infiltration rate (in./hr)
 12    POND    Per cent ponds and reservoirs (%)
 13    I       Precipitation intensity (in./hr)
 14    ISD     Standard deviation of intensities (in./hr)
 15    D       Duration of storm (hr)
 16    TDF     Time distribution of precipitation (hr)
 17    TPCP    Total precipitation (in.)
 18    API     14-day antecedent precipitation index (in.)
 19    SOLM    Soil moisture (%)
 20    WDIR    Wind direction (dimensionless)
 21    WEEK    Week of the year (dimensionless)
 22    AIRT    Mean air temperature (deg F)
 23    ATSD    Standard deviation of air temperatures (deg F)
 24    WVEL    Mean wind velocity (mph)
 25    WVSD    Standard deviation of wind velocities (mph)
 26    SOLT    Mean soil temperature (deg F)
 27    STSD    Standard deviation of soil temperatures (deg F)
 28    DEGD    Degree-Days (deg F)
 29    SWEQ    Snow-water equivalent in snowpack (in.)

A few comments concerning the detailed descriptions
of the variables should be made before proceeding
with the discussion of the analysis. The Thiessen areas
for the various types of "weighting" were determined from
repeated planimetering of reduced SCS soil maps of the
watersheds. The area attributed to each weather station,
snow course, or precipitation station was determined by
drawing perpendicular bisectors of the lines connecting
the stations. The bisectors were intersected with each
other and with the watershed boundaries, forming the
polygon for each station.
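The perpendicular-bisector construction assigns every point of the watershed to its nearest station, so the Thiessen weights can be approximated numerically by counting grid cells instead of planimetering a map. The Python sketch below illustrates that idea only; it is not the procedure used in this study, and the function names, the grid resolution, and the `inside` test describing the watershed boundary are all hypothetical.

```python
def thiessen_weights(stations, inside, bounds, n=100):
    """Approximate the fraction of watershed area nearest to each station.

    stations : list of (x, y) station coordinates
    inside   : function (x, y) -> True when the point lies in the watershed
    bounds   : (xmin, xmax, ymin, ymax) of a box enclosing the watershed
    n        : number of grid cells along each side of the box
    """
    xmin, xmax, ymin, ymax = bounds
    counts = [0] * len(stations)
    for ix in range(n):
        x = xmin + (ix + 0.5) * (xmax - xmin) / n
        for iy in range(n):
            y = ymin + (iy + 0.5) * (ymax - ymin) / n
            if not inside(x, y):
                continue
            nearest = min(
                range(len(stations)),
                key=lambda s: (x - stations[s][0]) ** 2 + (y - stations[s][1]) ** 2,
            )
            counts[nearest] += 1
    total = sum(counts)
    return [count / total for count in counts]

def weighted_mean(values, weights):
    """Area-weighted average of the station values."""
    return sum(v * w for v, w in zip(values, weights))
```

Two stations placed symmetrically in a square watershed each receive half of the area, so the weighted value is the simple mean of the two readings.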

Because contoured topographic maps with small contour
intervals were not available for the Duck Creek and East
Fork of Duck Creek watersheds, nor for the Bacon Creek
watershed, transit-stadia surveys of the main channels
of Duck Creek and East Fork of Duck Creek were performed
to establish the vertical properties of these watershed
channels. The slopes obtained from the survey varied by
less than 10 per cent from the slopes obtained from the
contoured USGS maps. A careful analysis of these maps
will therefore yield satisfactory values for the slope
variables, GNDS and S. For this reason, a survey of the
Bacon Creek watershed was not conducted, and a similar
large-interval map was used.

The variables measuring ground slope and overland
distance of flow were measured in the manner chosen
because of the difficulty of other methods of representing
these parameters. Some investigators use the method
employed here, while others measure a random number of
distances and slopes from ridges to channels, and average
these. Because the steepest portions of watersheds are
generally in the upper reaches, the method employed
attempts to establish the steepest and shortest path for
the overland flow.

Several methods of measuring the population of stream
segments on a watershed can be found in the literature
(Chow, 1964). The method chosen represents the number
of segments per unit area of the watershed, and therefore
has a physical meaning. A large value for this variable
would indicate that the watershed exhibits a large
number of branching streams and tributaries.

Because aerial photographs were not available for
each year of data recorded, the land-use variable remained
constant for each runoff event. Other forms for this
variable have been suggested, but the available data were
not sufficient to warrant the use of any of them. The
index chosen increases with the vegetation on the watershed,
and therefore has a physical interpretation.

The infiltration variable was chosen to measure an
average infiltration rate for the different soil types.
A single number for each watershed is not altogether
reasonable if many soil types are present, but the
agreement of the average infiltration rates on each of the
three SCS soil types with published data was encouraging.
Table III compares the measured values with the suggested
ranges from the literature.

Table III.  Measured and Published SCS Infiltration Rates

SCS Soil Type     Measured (in./hr)     Published* (in./hr)

      B                0.129               0.15 to 0.30
      C                0.094               0.05 to 0.15
      D                0.040               0.00 to 0.05

*From Chow (1964) p. 12-26.

The pondage variable was not easily obtained, and
may not be indicative of the storage which takes place
during a runoff event. The purpose of using this variable
was to indicate the possible storage capacity of
the watersheds. Although the depths of the reservoirs
are not included in the variable, the surface area is
frequently found to be related to depth, and both may
be measured in the same variable.

The remaining variables were chosen because of their
frequent use in similar studies. The determination of values
of all the variables was time-consuming because computations
of some of the variables required analyses of large amounts
of data. The development of good prediction equations
usually requires easily-obtained variables, and this fact
was also considered when choosing the variables. Table IV
summarizes the physiographic variables, which were assumed
to remain constant for each watershed. The values of the
storm variables are not tabulated herein but are available
in the Department of Civil Engineering and Engineering
Mechanics, Montana State University. The means and standard
deviations of the storm and physiographic variables are
listed in Appendix H for the raw and log-transformed
variables.

Table IV.  Summary of Watershed Characteristics

Variable         Lone Man    Bacon      Hump       Duck       E.F. Duck
                 Coulee      Creek      Creek      Creek      Creek

A (sq mi)        14.1        17.97      7.61       53.79      13.67
SHP (mi/mi)      2.448       4.332      2.569      2.721      3.684
AZ (deg)         258         289        177        164        138
ELEV (ft)        3916        4578       4144       2853       2862
GNDS (ft/ft)     0.0194      0.0162     0.0650     0.0596     0.0763
GNDL (mi)        0.878       0.516      0.239      0.820      0.380
FREQ (1/sq mi)   5.46        26.66      24.18      14.05      15.80
L (mi)           8.446       12.895     6.444      23.614     13.051
S (ft/ft)        0.00639     0.01182    0.02412    0.00596    0.00692
USE (%)          1.35        75.18      45.64      9.67       10.51
INFR (in/hr)     0.09481     0.08824    0.10155    0.10022    0.09689
POND (%)         0.02352     0.00915    0.00000    0.00614    0.02887


TREATMENT OF MISSING DATA


Operational and mechanical problems with the instruments
during the times of interest caused some missing
values for the storm variables. At least one recording
precipitation station was operational during each of the
storms, and hence no information was lost for the
precipitation variables. Thiessen areas for only those stations
which were recording were used whenever some stations
were non-operative. The greatest losses of complete data
occurred from the weather stations. Whenever data were
missing from all the weather stations on a watershed,
the average of the variable computed for the same time
period over the other years of data was used. The one
exception to this was the variable for wind direction.
Here, the time difference of appearance of precipitation
on the recording precipitation stations was used to
estimate the direction of movement of the storm, and
therefore the probable wind direction.

The University computers were used extensively in
the reduction of the data for the value of each storm
variable for all the storms. Most of the programs written
were elementary and are not included with this report.
One program for computation of variables 13, 14, 15, 16,
and 17 in Table II is rather involved, and is included
in Appendix D. The standard Weather Bureau format for
recording and non-recording precipitation data was used
in punching the project data, but operations on this
format were found to be difficult. The included program
could probably have been much shorter if another input
format had been used.

ANALYSIS  OF  31  VARIABLES 
Correlations 

After the values of the 29 independent variables
and 2 dependent variables were computed for the 50 runoff
events, the data were transformed by taking the base-10
logarithms of all the values. This was done in order to
use Equation (3) as the model for the prediction equation.
Zero measurements of any of the variables were set equal
to 0.00001 before transformation, in order to compute
logarithms with the computer. The correlation coefficients
of each of the transformed variables with all other
variables were then computed using a modified version of a
correlation computer program from Cooley and Lohnes (Appendix
E). The means and standard deviations of all log-transformed
variables and of the original variables were also
computed, and these are listed in Appendix H. Computations
for the eigenvectors and their respective eigenvalues
required the correlations between the standardized,
transformed variables (Equations 9 and 10). However, since
these correlations are identical to the values obtained
for the log-transformed variables, as stated earlier,
standardization was not employed at this point of the
analysis.
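The two points made in this paragraph, the replacement of zero measurements before taking logarithms and the fact that standardization leaves a correlation coefficient unchanged, can be demonstrated with a brief Python sketch. It is not the modified Cooley and Lohnes program; the function names are hypothetical and any data passed to them are invented for illustration.

```python
import math

def log10_with_floor(values, floor=0.00001):
    """Base-10 logarithms, with zero measurements set to the floor first."""
    return [math.log10(v if v != 0 else floor) for v in values]

def standardize(values):
    """Subtract the mean and divide by the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def correlation(x, y):
    """Product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

Standardizing is a linear change of origin and scale with a positive multiplier, so the correlation of the standardized logarithms equals the correlation of the logarithms themselves.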

Principal  Component  Analysis 

The correlation coefficients for the 29 independent
variables were entered into a modified version of a
principal component computer program given by Cooley and
Lohnes (Appendix F). This program computed the eigenvalues
and normalized eigenvectors of the correlation matrix,
giving 29 principal components and their respective
variances. Table V lists the variance, percentage of total
variance, and accumulated percentage of the first 18
components. Equation (14) was used in computing the per
cent of the total variance accounted for by each component,
where p was the number of independent variables. The
other eleven components contributed about equally to the
small per cent remaining, and were considered unimportant.

Table V.  Properties of the First 18 Components

Component     Variance        Per Cent of       Accumulated
Number        (Eigenvalue)    Total Variance    Per Cent

    1           9.625            33.19             33.19
    2           4.772            16.46             49.65
    3           3.778            13.03             62.68
    4           2.323             8.01             70.69
    5           1.730             5.97             76.66
    6           1.385             4.77             81.43
    7           1.025             3.53             84.96
    8           0.961             3.31             88.27
    9           0.806             2.78             91.05
   10           0.684             2.36             93.41
   11           0.430             1.48             94.89
   12           0.410             1.41             96.30
   13           0.278             0.96             97.26
   14           0.214             0.74             98.00
   15           0.180             0.62             98.62
   16           0.140             0.48             99.10
   17           0.114             0.39             99.49
   18           0.072             0.25             99.74

The criteria of Eiselstein, Cooley and Lohnes, and
Kendall were combined in determining which components to
retain for further analyses. If only those components
whose eigenvalues are greater than unity (Cooley and
Lohnes) were used, then only about 85 per cent of the
information in the measurements would be retained by seven
components. There were no obvious "large" changes in the
values (Kendall), and twelve components would have to be
retained if those components having less than one per cent
of the variance were discarded. The author decided to
retain the first ten components, losing about 7 per cent of
the information.
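The counts quoted in this paragraph can be reproduced from the eigenvalues of Table V. The Python sketch below assumes that the per cent of total variance is 100 times the eigenvalue divided by p, which is the usual form for a correlation matrix whose eigenvalues sum to p; Equation (14) itself is not repeated in this chapter, so that form is an assumption here. The variable and function names are hypothetical.

```python
# Eigenvalues of the first 18 components, from Table V.
EIGENVALUES = [9.625, 4.772, 3.778, 2.323, 1.730, 1.385, 1.025, 0.961, 0.806,
               0.684, 0.430, 0.410, 0.278, 0.214, 0.180, 0.140, 0.114, 0.072]
P = 29  # number of independent variables

def percent_of_variance(eigenvalues, p):
    """Per cent of the total variance carried by each component."""
    return [100.0 * value / p for value in eigenvalues]

def accumulated(percents):
    """Running total of the percentages."""
    out, running = [], 0.0
    for pct in percents:
        running += pct
        out.append(running)
    return out

percents = percent_of_variance(EIGENVALUES, P)
cumulative = accumulated(percents)

# Eigenvalue-greater-than-unity rule.
n_unity = sum(1 for value in EIGENVALUES if value > 1.0)
# Rule that discards components with less than one per cent of the variance.
n_one_percent = sum(1 for pct in percents if pct >= 1.0)
```

With these eigenvalues, `n_unity` is 7, `n_one_percent` is 12, and the first ten components accumulate about 93.4 per cent of the variance, which is the basis for the statement that about 7 per cent of the information is lost.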

Table VI shows that none of the 29 independent
variables had small correlations with all 10 of the
important components. The reduced values in the rows of the
table are the correlations of the variables with the
components. The numbered components correspond to the first
ten components of Table V, which gives the respective
variance of each. Because no information about the
independent variables could be obtained at this point, the
components were reduced to factors via Equation (12), in
preparation for rotation.

Table VI.  Reduced Loadings for the First 10 Components

                              Component
Variable     1     2     3     4     5     6     7     8     9    10

A          .11  -.16   .30   .01   .25  -.39  -.29  -.01  -.08   .01
SHP        .16  -.08  -.15  -.46  -.24   .00  -.03  -.08   .13  -.06
AZ        -.19   .33  -.06  -.06   .07  -.19   .10  -.00   .12   .12
ELEV      -.16   .36  -.15  -.05   .04  -.14   .05  -.02   .09   .15
GNDS       .20  -.32  -.10   .10  -.02   .15  -.07   .00  -.14  -.10
GNDL      -.15   .09   .40   .08   .14  -.20   .15   .04   .04   .03
FREQ       .23  -.12  -.27  -.20   .04  -.18   .10  -.08  -.04   .02
L          .18  -.25   .21  -.12   .12  -.28   .21  -.03  -.05  -.04
S          .06   .14  -.46   .03   .11  -.11   .04  -.05  -.07   .13
USE        .21  -.06  -.29  -.22   .06  -.24   .14  -.08  -.03   .05
INFR       .10  -.26  -.01   .44   .20   .08  -.11   .05  -.26  -.04
POND      -.07  -.05   .37  -.29  -.27   .14   .09   .01   .20  -.13
I          .26   .19   .06  -.02   .00   .04   .14   .20  -.01   .05
ISD        .27   .18   .08   .00   .03   .05   .11   .17   .00  -.09
D         -.18  -.15  -.02  -.13   .36   .12   .07  -.31   .16  -.05
TDF        .06  -.03   .08   .09   .02  -.50  -.63  -.28   .11  -.16
TPCP       .19   .16   .03  -.30   .36   .30   .16  -.18   .15  -.05
API        .19   .04   .06   .11   .23   .29  -.20  -.33   .19  -.18
SOLM      -.09   .23  -.14   .00   .44  -.09  -.25   .16   .08  -.11
WDIR       .08  -.22   .10  -.09   .16   .05  -.35   .20   .05   .77
WEEK       .27   .17   .10   .03   .01   .03  -.06  -.01   .19  -.01
AIRT       .22   .21   .11  -.04  -.07   .07  -.27   .12  -.21   .06
ATSD      -.11  -.12  -.07  -.18   .30   .09  -.12   .61   .07  -.25
WVEL      -.17   .05   .14  -.25   .15   .23  -.01  -.35  -.42   .26
WVSD      -.13   .12   .06  -.34   .12  -.02  -.16   .04  -.58  -.26
SOLT       .28   .18   .10   .03  -.05   .00  -.03   .03  -.05   .03
STSD       .18  -.13   .09  -.34   .21   .03  -.03   .05   .21  -.00
DEGD       .26   .16   .07   .02   .00  -.05  -.05  -.02  -.25  -.12
SWEQ      -.25  -.20  -.11  -.01  -.02  -.03  -.03   .05   .03  -.12

Varimax Rotation of Principal Factors

An initial varimax rotation of the ten factors was
performed using a modified version of a computer program
listed by Cooley and Lohnes (Appendix G). The program
was modified to read either components and eigenvalues,
or factors. Computations for the variance of each rotated
factor were not initially made by the program, and
statements for the computations and output were included. The
program was tested with data from both Eiselstein and
Harman, and the results agreed very well with those
obtained by both investigators.

The ten factors were rotated by the program, giving
the new factors shown in Table VII. The reduced variance,
per cent of original variance, and accumulated percentage
of each factor are included in the table, showing that the
ten rotated factors account for all the variance in the
ten components, as expected. The loadings of the factors
are the correlations of each variable with each factor,
and the largest correlation for each variable is underlined.
As shown, all the variables had a high correlation with
only one factor.

Table VII.  Properties of the 10 Rotated Factors

Factor       1     2     3     4     5     6     7     8     9    10

Variance   7.3   4.3   3.8   2.8   2.4   1.6   1.4   1.2   1.2   1.1
Per Cent  25.2  14.7  13.1   9.5   8.3   5.5   5.2   4.2   4.0   3.7
Accum. %  25.2  39.9  53.0  62.5  70.8  76.3  81.5  85.7  89.7  93.4

A          .14  -.20   .08   .11  -.9?   .02   .02  -.02  -.11   .09
SHP        .18  -.09  -.94   .22   .08  -.02  -.02  -.02   .01   .04
AZ        -.08   .91   .35  -.03   .04  -.14  -.06   .00   .02  -.11
ELEV      -.06   .82   .14  -.31   .30  -.09  -.05   .01   .04  -.14
GNDS       .09  -.90  -.35  -.11  -.07   .16   .06  -.00  -.02   .11
GNDL      -.09   .40   .68   .35  -.48  -.11  -.03  -.00  -.04  -.00
FREQ       .24  -.28  -.82  -.37  -.17   .09   .02  -.02  -.03   .04
L          .19  -.41  -.29   .19  -.30   .04   .03  -.02  -.09   .12
S          .09   .17  -.40  -.82   .32   .08   .01  -.00   .03  -.08
USE        .25  -.14  -.82  -.44  -.18   .08   .02  -.02  -.03   .02
INFR      -.01  -.83   .31  -.35  -.21   .18   .08   .01  -.04   .08
POND      -.06   .11   .04   .97  -.08  -.15  -.04  -.00   .00   .05
I          .91   .00  -.17  -.07  -.11   .12   .07   .02   .15   .01
ISD        .91  -.06  -.14  -.03  -.14   .12   .13   .05   .09  -.06
D            ?   .08   .03   .01  -.13  -.21   .45   .12   .06   .05
TDF        .09  -.04  -.01  -.11  -.15   .06  -.01  -.08  -.95   .06
TPCP       .57   .02  -.03  -.15  -.10  -.02   .69   .03   .21  -.00
API        .47  -.29  -.01  -.02   .05   .06   .68  -.07  -.17   .01
SOLM      -.02   .51   .19  -.44   .16  -.14   .23   .48  -.19  -.01
WDIR       .02  -.31  -.10   .09  -.17   .01   .01   .09  -.07   .91
WEEK       .87  -.02  -.14   .03  -.07   .21   .26  -.08  -.10   .04
AIRT       .99  -.03  -.05   .06   .11  -.14  -.00  -.01  -.13   .12
ATSD      -.36  -.02  -.03   .02  -.00  -.08  -.03   .84   .13   .10
WVEL      -.30   .26   .17   .18   .02  -.76   .18  -.18   .18   .16
WVSD      -.09   .31  -.01   .12   .02  -.82  -.13   .28  -.04  -.14
SOLT         ?  -.08  -.13  -.02  -.08   .09   .10  -.12  -.06   .02
STSD       .27  -.18  -.53   .24  -.37  -.02   .32   .25  -.01   .28
DEGD       .86  -.14  -.13  -.12  -.12  -.08   .07  -.09  -.13  -.09
SWEQ      -.88  -.02   .10   .11   .11  -.04  -.18   .18  -.00  -.06

From the loadings and the criteria in the literature,
the variables were examined for their importance to the
rotated factors. As with the choice of the important
components, there are no analytical means of determining
which variables are important, or which factors could be
omitted. The literature stresses that the important
variables should be chosen with "one hand over their
names", so that the investigator does not force the data
to indicate a wrong interpretation. Investigators of
multivariate analyses tend to agree that no more than two
variables per factor should be retained, and that loadings
smaller than about 0.40 can usually be considered
indicative of an unimportant correlation (Wallis, 1965).
This value should be considered only as a guide and not
as the accepted criterion for unimportance.
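
The 0.40 guide can be applied mechanically as a first screening pass before any judgment is exercised. The following is a minimal sketch, assuming three rows of loadings copied as printed in Table VII (POND, I, and SOLM); the function name `dominant_factor` and the use of NumPy are illustrative choices and are not part of the original study.

```python
import numpy as np

# Rotated loadings on factors 1..10 for three variables, as printed in Table VII.
loadings = {
    "POND": [-.06, .11, .04, .97, -.08, -.15, -.04, -.00, .00, .05],
    "I":    [.91, .00, -.17, -.07, -.11, .12, .07, .02, .15, .01],
    "SOLM": [-.02, .51, .19, -.44, .16, -.14, .23, .48, -.19, -.01],
}

def dominant_factor(row, guide=0.40):
    """Return (factor number, loading, passes_guide) for the largest absolute loading."""
    row = np.asarray(row, dtype=float)
    k = int(np.argmax(np.abs(row)))          # position of the largest |loading|
    return k + 1, float(row[k]), bool(abs(row[k]) >= guide)

summary = {name: dominant_factor(row) for name, row in loadings.items()}
```

A variable such as SOLM clears the 0.40 guide on its largest loading yet spreads comparable loadings over several factors, which is one reason the guide is treated above as a starting point rather than a rule.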

An investigation of Table VII yields several possible
interpretations, even though the simple structure is
relatively clear because of the single, large loading
on each of the variables. Seven of the variables have
comparatively small correlations with all the factors,
and five of the remaining variables are highly correlated
with factors whose variances are small. The variables
which are not highly correlated with any of the factors,
GNDL, D, TPCP, API, SOLM, and STSD, do not contribute
appreciably to the information contained by the
measurements of the original 29 independent variables. Because
the rotated factors "reproduce" about 93 per cent of the
original information, omission of the seven variables listed
would give approximately the same information contained
by measurements of all the variables.

Also, because of the small variances of the last four
or five factors, these factors do not reproduce a large
portion of the original information. The last three are
determined by essentially one variable each, ATSD, TDF,
and WDIR, respectively. These three variables are highly
correlated with the last three factors, but the factors
are not highly important to the total variance, and the
variables were therefore deemed to be relatively
unimportant.

Two  other  variables,  L  and  WVSD,  were  important, 
respectively,  to  the  fifth  and  sixth  factors.  The  sixth 
factor  was  classified  with  the  last  three,  and  the  variable 
WVSD  was  also  deemed  to  be  unimportant.  The  relatively 
large change in variance between the fifth and sixth
factors, when compared to the other changes, seems to justify
the single classification of the last five factors, i.e.,
not nearly as important as the first five.

The variables L and A are the only two variables
important to the fifth factor. Because variables which
"coexist" within a factor are generally highly correlated,
the exclusion of one yields essentially the same
information obtained when both are measured. Because the
variable A was considerably more highly correlated with
this factor, and because the factor was the least important
of the first five, L was also deemed unimportant.

The interpretation given in the preceding paragraphs
excludes 12 independent variables. An investigation of
the sixth and seventh factors shows that all the
variables correlated with these factors have been
excluded, and that the variances of the two factors are
essentially equal. Wallis (1965) warns against omitting all
the variables important to the important factors, and for
this reason the "hand" was lifted and the variable TPCP in
the seventh factor was not omitted in further analyses.
Objectivity, and the frequency of appearance of this
variable in other investigations, prohibited its exclusion here.
This essentially means that the first six factors were
deemed important, if the sixth and seventh factors in
Table VII are reversed. This interpretation left only 18
independent variables for further analysis.

Because a regression equation involving 18 independent
variables is probably not practical, two attempts to
further reduce the number of variables were made. The first
involved the use of only those variables, again with the
exception of TPCP, which displayed correlations higher
than an arbitrary value of 0.85 with the important factors.
The second attempt took advantage of the facts that the
"coexistent" variables within a factor are usually highly
correlated, and that the factors themselves are
uncorrelated. One variable from each of the first six factors,
using the reversed notation on factors 6 and 7 in Table VII,
was retained to yield six relatively uncorrelated variables.
In the first factor, SOLT had the highest correlation;
however, for reasons discussed later, the author believed
that the variable having the second highest correlation,
I, would be a better variable to measure the information
contained by the first factor. The variables having the
highest correlations with the next five factors were also
included, yielding the six most uncorrelated variables.
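
Apart from the deliberate exception made for SOLT in the first factor, the third interpretation amounts to keeping the variable with the largest absolute loading on each retained factor. The sketch below illustrates that rule for two of the factors, using loadings as printed in Table VII; the dictionary layout and the function name `representative` are assumptions made only for illustration.

```python
# Largest loadings on two of the retained factors, as printed in Table VII.
loadings_by_factor = {
    4: {"POND": 0.97, "S": -0.82},
    7: {"TPCP": 0.69, "API": 0.68},
}

def representative(variables):
    """Pick the variable with the largest absolute loading on one factor."""
    return max(variables, key=lambda name: abs(variables[name]))

chosen = {factor: representative(v) for factor, v in loadings_by_factor.items()}
```

The near tie between TPCP and API on the seventh factor shows how sensitive such a rule can be to small differences in the loadings.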

Table  VIII  shows  the  three  interpretations  of  the  rotated 
factors.  Other  interpretations  were  possible,  but  the 
author  felt  that  these  would  be  adequate  for  the  regression 
analysis to follow. The second and third are meant only
to possibly reduce the number of independent variables,
and the first is felt to be the most appropriate
interpretation of the rotated factors.

Table VIII. Three Sets of Important Variables from Three
            Interpretations of Table VII

First             Second            Third
Interpretation    Interpretation    Interpretation

A                 A                 A
SHP               SHP               SHP
AZ                AZ
ELEV              ELEV
GNDS              GNDS              GNDS
FREQ
S
USE
INFR
POND              POND              POND
I                 I                 I
ISD               ISD
TPCP              TPCP              TPCP
WEEK              WEEK
AIRT              AIRT
SOLT              SOLT
DEGD              DEGD
SWEQ              SWEQ

18                14                6

If an interpretation of the successive importance of the
variables to the information were desired, Table IX would
be the best approach. As shown, the 29 independent
variables are all present. The criterion used in formulating
this table was the decreasing order of correlations of
the variables with the factors from Table VII, with the
exception of the first factor, where SOLT was placed third
in line instead of first.


Table IX. Successive Importance of the Variables
          to the Factors

Factor    Decreasing Order of Importance

  1       I, ISD, SOLT, AIRT, SWEQ, WEEK, DEGD, D
  2       AZ, GNDS, ELEV, INFR, SOLM
  3       SHP, USE, FREQ, GNDL, STSD
  4       POND, S
  5       A, L
  6       WVSD, WVEL
  7       TPCP, API
  8       ATSD
  9       TDF
 10       WDIR

After  the  interpretations  of  the  initial  rotation, 
the  ten  original,  unrotated  factors  were  rotated  three 
additional  times,  using  only  the  loadings  on  the  important 
variables  for  each  of  the  three  interpretations.  This  was 
done  to  refine  the  rotated  loadings  on  the  important  vari¬ 
ables.  In  each  of  the  three  rotations,  the  coefficients 
did  not  appreciably  change,  but  the  new  rotated  factors 
were  computed  to  remove  any  effects  of  the  unimportant 
variables  in  each  case.  All  ten  of  the  original  factors 
were  rotated  to  maintain  a  relatively  high  percentage  of 
explained  variance.  If  only  the  first  six  factors  had 
been  rotated,  less  than  76  per  cent  of  the  information 
would have been retained (see Table VII). The reduction
in the number of variables used in the rotation causes
a  loss  of  information,  but  this  is  not  nearly  as  much  as 
would  have  been  lost  if  combined  with  a  further  reduction 
in  the  number  of  factors.  Table  X  outlines  the  retained 
information  for  the  three  rotations.  The  percentages  given 
are  for  the  original  variance  in  29  variables,  and  are 
therefore  small.  However,  the  retained  variance  of  the 
important  variables  is  large  in  each  case. 

Table X. Variance Retained by Rotations of the Original
         Factors with Reduced Numbers of Variables

Interpretation    Variables    Total Variance    % of Original
                               of 10 Factors     Variance

      1               18           17.14             88.9
      2               14           13.18             87.9
      3                6            5.57             86.6
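
The rotation program itself is not listed in this section. The following is therefore only a minimal sketch of a varimax rotation as it is commonly implemented today (an SVD-based iteration), not the routine used in the study. The function name `varimax`, the tolerance, and the small demonstration matrix are all assumptions. The last two lines illustrate why rotating with fewer variables lowers the total variance of the factors: that variance is the sum of the squared loadings of the variables that remain.

```python
import numpy as np

def varimax(loadings, max_iter=500, tol=1e-10):
    """Orthogonally rotate a (variables x factors) loading matrix toward simple structure."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    R = np.eye(k)                      # accumulated orthogonal rotation
    objective = 0.0
    for _ in range(max_iter):
        LR = L @ R
        # Gradient of the varimax criterion with respect to the rotation.
        grad = L.T @ (LR**3 - LR @ np.diag((LR**2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(grad)
        R = u @ vt                     # orthogonal matrix closest to the gradient
        if s.sum() < objective * (1.0 + tol):
            break                      # criterion no longer improving
        objective = s.sum()
    return L @ R, R

# Hypothetical unrotated loadings: 6 variables on 2 factors (not data from the study).
unrotated = np.array([
    [0.70, 0.50], [0.65, 0.55], [0.72, 0.45],
    [0.60, -0.55], [0.68, -0.50], [0.62, -0.60],
])
rotated, R = varimax(unrotated)

# Variance carried by the factors = sum of squared loadings (all rows, then a subset).
total_all = (rotated**2).sum()
total_subset = (rotated[:4]**2).sum()
```

Because the rotation is orthogonal, the sum of squared loadings for each variable is unchanged by it; only the distribution of that sum among the factors changes.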

Regression  Analyses 

Four  prediction  equations  having  the  form  of  Equation 
(3)  were  obtained.  The  first  used  the  modified  principal 
component  method,  and  the  other  three  used  computed  values 
of  the  rotated  factors  from  the  original  data. 

Equation (21) was used to find the regression
coefficients for the ten important principal components, using
the six-place values of the loadings and eigenvalues listed
in Tables V and VI. The computations of Equation (21)

    $\ldots = \sum_{k=1}^{p} l_{kj}\, r_{ky}$     (21)

and the procedure of Equations (17) through (20) were
done by a short computer program written for this purpose,
giving the coefficients of the measured independent
variables. These are given in Table XI at the end of this
chapter, along with the results of the other regression
analyses. Multiplicative equations for both dependent
variables may be obtained from this table and Equation (2)

    $Y = B_0\, X_1^{B_1}\, X_2^{B_2}\, X_3^{B_3} \cdots$     (2)

where $B_0$ is the antilogarithm of the constant of regression.
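
Because Equation (2) is a product of powers, taking logarithms turns it into a linear equation whose constant is the logarithm of the leading coefficient, which is why the regression constant has to be converted back with an antilogarithm. The sketch below checks that equivalence numerically. It assumes base-10 logarithms, and the function names, coefficients, and measurements are hypothetical; none of the numbers come from Table XI.

```python
import math

def predict(b0, exponents, x):
    """Evaluate Y = B0 * X1**B1 * X2**B2 * ... for one set of measurements."""
    y = b0
    for name, b in exponents.items():
        y *= x[name] ** b
    return y

def predict_via_logs(log_b0, exponents, x):
    """Same prediction computed in the log-linear form, then back-transformed."""
    log_y = log_b0 + sum(b * math.log10(x[name]) for name, b in exponents.items())
    return 10.0 ** log_y

# Hypothetical coefficients and measurements, for illustration only.
exponents = {"A": 0.5, "I": 1.2, "SHP": -0.3}
measurements = {"A": 4.0, "I": 2.0, "SHP": 1.5}
log_b0 = 0.25                      # regression constant in log space
b0 = 10.0 ** log_b0                # antilogarithm used in the multiplicative form

direct = predict(b0, exponents, measurements)
from_logs = predict_via_logs(log_b0, exponents, measurements)
```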

An  ordinary  linear  multiple  regression  analysis  was 
performed  for  each  of  the  two  dependent  variables  and  the 
ten  rotated  factors  for  each  of  the  three  interpreta¬ 
tions  of  the  initial  rotation.  A  computer  program,  "One- 
Pass  Multiple  Regression",  written  by  the  Computing  Center 
Staff  for  general  use,  was  employed.  Input  to  this  program 
for  each  analysis  was  50  observations  of  each  dependent 
variable and 50 computed values of ten factors, because the
regression was to be performed on the factors and not on
the important variables. The 50 values of the 10 factors
for each regression were computed and punched by a program
written to comply with Equation (22). The x values in
this equation are observations of the standardized form of
the important, log-transformed variables. The standardized
values were computed from Equation (19) for all 50
observations, and only the measurements of the 18, 14, or 6
important variables were used in the computations for the
values of the factors.
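
Equations (19) and (22) are not reproduced in this section, so the sketch below assumes only the usual form of standardization (subtract the mean, divide by the standard deviation) applied to the logarithms of the measurements, followed by a weighted sum to form each factor value. The weights, the base of the logarithm, and the random data are hypothetical stand-ins, not values from the study.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_vars, n_factors = 50, 6, 10   # 50 storms; 6 variables as in the third interpretation

# Hypothetical positive measurements and factor weights (not data from the study).
x = rng.uniform(0.5, 20.0, size=(n_obs, n_vars))
weights = rng.normal(size=(n_vars, n_factors))

log_x = np.log10(x)                                           # log transformation (base 10 assumed)
z = (log_x - log_x.mean(axis=0)) / log_x.std(axis=0, ddof=1)  # standardized form

factor_values = z @ weights            # one row of 10 factor values per storm
```

The result has the shape described in the text: 50 computed values for each of ten factors, built from the retained variables only.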

The results of the regressions were the coefficients
for the rotated factors, $G_0$ and $G_j$, which had to be reduced
to coefficients for the important, log-transformed
independent variables. The values of $G_0$ and $G_j$ were substituted
into Equations (23), and a computer program for these
computations yielded the values of $B_i$ and $B_0$ for Equation
(2), the desired multiplicative form of the prediction
equation. Table XI summarizes the results of all eight
regression analyses. The antilogarithms of the constants are
given, and any of the prediction equations may be obtained from
Equation (2), where the X's are the field measurements of
the variables, $B_0$ is the antilogarithm of the regression
constant, and the $B_i$ are the coefficients listed. For
example, the equation for the peak discharge rate from the
second factor regression is found from Table XI and
Equation (2) to be


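QMAX = 1.476806 x (product of the 14 variables below, each
       raised to its exponent)

Variable    Exponent (magnitude)

A           .263100
AZ          .628922
ELEV        .318249
POND        .091439
SHP         .387924
GNDS        .303629
ISD         .179825
I           .151434
TPCP        .068278
WEEK        .660111
AIRT        .382531
DEGD        .432310
SOLT        .302547
SWEQ        .078893

[The printed equation is a quotient. The placement of the
terms above or below the fraction bar, which gives the signs
of the exponents, is not legible in this copy.]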


The above equation was obtained from the fifth column of
Table XI, and an equation for the runoff volume may be
obtained in a similar fashion from the sixth column.
The units for QMAX in the above equation and in the other
equations for QMAX are cubic feet per second, and the units
for RUNF are inches, provided that the units of the
independent variables are the same as the units used in Table II.
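
Equations (23) are not shown in this section, but the reduction they perform can be illustrated under two assumptions: that each factor value is a weighted sum of standardized logarithms, and that the regression is linear in the factors. Under those assumptions the exponent of each variable and the constant follow by direct substitution. All names and numbers below are hypothetical, and base-10 logarithms are assumed.

```python
import numpy as np

rng = np.random.default_rng(1)
n_vars, n_factors = 6, 10

# Hypothetical quantities standing in for the study's values.
m = rng.normal(size=n_vars)                           # means of the log-transformed variables
s = rng.uniform(0.2, 1.0, size=n_vars)                # their standard deviations
W = rng.normal(scale=0.3, size=(n_vars, n_factors))   # weights building factors from standardized logs
G0 = 0.3                                              # regression constant for the factors
G = rng.normal(scale=0.3, size=n_factors)             # regression coefficients for the factors

# Substitute F = z @ W and z = (log x - m) / s into log y = G0 + F @ G.
B = (W @ G) / s                    # exponents of the variables in the multiplicative form
log_B0 = G0 - B @ m
B0 = 10.0 ** log_B0                # antilogarithm of the constant

# Check on one hypothetical set of field measurements.
x = rng.uniform(0.5, 20.0, size=n_vars)
z = (np.log10(x) - m) / s
y_from_factors = 10.0 ** (G0 + (z @ W) @ G)
y_from_multiplicative_form = B0 * np.prod(x ** B)
```

The two predictions agree, which shows that a regression carried out on the factors can be restated as a multiplicative equation in the field measurements themselves.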
Chapter V

DISCUSSION  OF  RESULTS 

The methods employed and the results obtained in the
previous chapter indicate that the multivariate procedures
provide a statistical means of determining the variables
which are more important to runoff than other variables.
A discussion of the variables and the prediction equations
involving the variables follows.

VARIABLES  IMPORTANT  TO  RUNOFF 

The variables which were selected for this study are
believed to have been the best possible with the available
data. Each possibility was carefully analyzed before the
final decisions were made. The possibility of neglecting
unknown variables which have considerable importance to the
runoff always exists in any hydrologic analysis, because the
processes and factors are not yet completely defined (Sharp
and Biswas, 1965). The variables were selected according
to criteria set forth by Chow (1964). They are grouped
into two categories, "climatic" and "physiographic." The
climatic factors are sub-classified as those measuring
four processes of hydrology: precipitation, interception,
evaporation, and transpiration. The physiographic factors
have two major sub-sets, basin and channel characteristics.

Practically all of the factors listed by Chow were
present in this study. The precipitation factors of
intensity, duration, time distribution, areal distribution,
direction of storm movement, antecedent precipitation,
and soil moisture were directly or indirectly measured
by the variables I, D, TDF, TPCP, WDIR, API, and SOLM,
respectively.

The  interception  factors  were  probably  not  measured 
as  well  as  the  others,  but  the  land-use  variable,  USE, 
indirectly measured the vegetation. This variable,
combined with the week of the year, and possibly the wind
velocity, should give any model a "feeling" for the amount
of interception which might occur in a storm.

An accurate amount of evaporation is extremely hard
to measure, but if the variables which affect evaporation
(air temperature, soil temperature, wind velocity, azimuth,
i.e., North-sloping vs South-sloping, areal extent, pondage,
and possibly soil moisture) are measured, then the model
has probably been given an estimate of the evaporation
which might occur.

Transpiration  is  relatively  unimportant  for  un¬ 
vegetated  watersheds.  The  variables  USE,  SOLM,  WEEK,  AIRT, 
WVEL,  and  SOLT  were  included  to  measure  vegetation,  soil 
moisture,  season,  air  temperature,  wind  velocity,  and  soil 
temperature, and all of these probably affect the amount
of transpiration which occurs during a runoff event. Also,
transpiration  is  generally  negligible  during  periods  of 
rainfall,  and  it  would  be  logical  to  assume  that  small  loss¬ 
es  of  moisture  by  this  process  occurred  during  the  rela¬ 
tively  short  storm  periods  used  herein.  The  variables  DEGD 
and  SWEQ  were  included  because  snow-melt  runoff,  although  a 
function  of  many  variables,  is  at  least  related  to  the  ante¬ 
cedent  temperature  and  the  volume  of  water  in  the  snowpack 
before  an  event.  Snowmelt  runoff  events  comprised  about 
one-third  of  the  runoff  events  of  this  study,  and  a  predic¬ 
tion  equation  applicable  to  these  was  desired. 

The physiographic factors of size, shape, slope,
orientation, ponds and reservoirs, channel slope, and channel
length were measured by the variables A, SHP, GNDS, AZ,
ELEV, FREQ, INFR, POND, S, and L, respectively. Some
factors suggested by Chow which were not included in this
analysis  were  the  frequency  of  occurrence  of  precipitation, 
atmospheric  pressure,  shape  of  evaporative  surface,  solar 
radiation,  humidity,  ground  water  capacity,  sediment  trans¬ 
port,  and  channel  shape  and  roughness. 

IMPORTANT  VARIABLES  FROM  THE  ANALYSES 

The  variables  selected  as  the  most  important  to  each  of 
the  factors  in  Table  IX  follow  the  interpretation  of  the 
factor rotation (Table VII). The variables I, ISD, and
SOLT in factor 1 were not written according to the factor
loadings  for  two  reasons.  First,  the  weather  stations  were 
not  recording  for  14  of  the  50  storms,  and  the  treatment 
for  missing  data  had  to  be  applied  to  obtain  the  weather 
station  variables  for  these  storms.  The  intensities  and 
standard  deviation  of  intensities  were  measured  for  all 
storms,  and  the  author  therefore  felt  justified  in  placing 
the  soil  temperature  variable  in  the  third  position  of  im¬ 
portance  to  this  factor.  The  high  loading  in  this  factor 
for  SOLT  was  therefore  probably  due  to  the  missing  data 
treatment  used  for  this  variable.  The  average  soil  temper¬ 
ature  for  the  same  time  period  in  other  years  of  measurement 
may  not  have  been  the  best  choice  for  the  value  to  assign 
to  missing  storms.  However,  the  air  temperature  was  also 
missing for these storms, and no other alternative was
available.
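
The missing-data treatment described above, the average for the same time period in other years of measurement, can be sketched as follows. The function name, the years, and the temperatures are hypothetical and serve only to illustrate the idea.

```python
def fill_from_other_years(readings, missing_year):
    """Average the readings for the same period in every other year that has a value."""
    others = [value for year, value in readings.items()
              if year != missing_year and value is not None]
    if not others:
        raise ValueError("no other years available to average")
    return sum(others) / len(others)

# Hypothetical soil temperatures (deg F) for the same week in four years; 1966 is missing.
soil_temp = {1964: 52.0, 1965: 56.0, 1966: None, 1967: 54.0}
estimate = fill_from_other_years(soil_temp, 1966)
```

A value filled in this way depends only on the time of year, so for the affected storms the variable may track seasonal variables more closely than measured values would. That is consistent with the caution expressed above about the high loading for SOLT.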

All of the other variable "placings" in Table IX
appear to be logical. The position of SOLM in factor 2 might
be  questioned,  not  from  a  hydrologic  point  of  view,  but 
from  an  error  interpretation.  The  measurements  of  the 
soil  moisture  at  any  of  the  three  depths  were  probably 
the  least  accurately  obtained  of  all  the  variables.  The 
soil  moisture  readings  were  taken  from  variations  in 
resistance measured across electrodes embedded in gypsum
blocks, and the resistance readings were converted to water
content  values  by  means  of  calibration  charts.  The  read¬ 
ings  did  not  seem  to  fluctuate  considerably  from  day  to 
day,  and  only  occasionally  did  they  change  during  a  storm. 
The  calibration  was  not  well  established,  and  some  of  the 
soil  moisture  values  used  may  therefore  be  questionable. 

In  any  event,  soil  moisture  had  a  small  correlation  with 
all  the  factors,  and  since  no  information  on  the  accuracy 
of  the  readings  was  available,  the  variable  was  discarded 
from  further  analyses.  Intuitively,  the  soil  moisture, 
or  at  least  the  rate-change  in  soil  moisture  shortly  after 
a  storm,  would  indicate  the  amount  of  precipitation  that 
was  being  lost  to  the  soil.  Also,  the  soil  moisture  be¬ 
fore  a  storm  should  indicate  the  capacity  of  the  soil  for 
receiving  the  precipitation.  This  variable  should  there¬ 
fore  not  be  excluded  from  future  analyses  simply  because 
this  study  resulted  in  its  being  deemed  unimportant. 

Variables deemed important in each of the three
factor-regressions deserve discussion. In the third regression,
four of the six variables chosen from the first six factors
appear in virtually all regression equations that have
been proposed for both peak discharge rate and runoff
volume. The watershed area, watershed shape, storm intensity,
and total amount of precipitation are intuitively important
to runoff and, from the results, are shown to be
mathematically important as well. The high importance of the
maximum ground slope, GNDS, could be attributed to two
possible explanations. Error of measurement may have caused
an "apparent" importance, because the end of the blue line
on U.S.G.S. maps may not be indicative of the actual main
channel head. When this fact is combined with the
questionable precision of contours on the maps, a strong
possibility for errors in the measurement of this variable exists.
An alternative explanation, which would justify the importance
of this variable, is the supposition that these were
"small" watersheds. Because the amount of overland flow
rather than channel flow is the criterion (Chow, p. 14-5)
for defining a small watershed, ground slope is more
important to peak discharge rates for smaller watersheds.
Neither of these reasons for the importance of GNDS in the
rotated factors could be selected as the better, and further
research is suggested for the solution.

The sixth variable in the third regression equation,
pondage, and the main channel slope were the only
important variables in the fourth rotated factor. The great
difference between the coefficients for these variables
forced the use of POND from this factor rather than S.
Had the correlation of S with this factor been larger,
the author would have used S instead of POND in the third
factor-regression, because the pondage variable was
difficult to measure. Pondage is certainly important to
runoff volumes and peak discharge rates, but the method of
obtaining this variable should create some doubt about
the high importance indicated by the analysis.

All  three  variables  measuring  the  wind  velocity  and 
direction  were  deemed  unimportant  by  the  analysis.  The 
measurements  of  the  variables  were  fairly  accurate,  and 
the  indication  of  the  rotation  is  probably  correct.  Evap¬ 
oration  may  not  have  been  significant  because  the  time 
from  the  beginning  of  precipitation  to  the  end  of  the  run¬ 
off  hydrograph  for  most  of  the  storms  was  relatively  short. 

Duration  and  time  distribution  of  the  storms  were 
other  variables  deemed  unimportant  by  the  analysis.  The 
author  feels  that  this  is  probably  due  to  the  inclusion 
of  snowmelt  events  in  the  analysis.  Because  the  last  major 
snowfall  before  the  respective  snowmelt-runoff  was  used, 
the  duration  was  generally  large  for  these  events,  and  the 
time distribution was generally zero. Because peak
discharge rates from "small" watersheds are highly sensitive to
short duration, high-intensity storms (Chow, 1964), the
large durations of snowfall storms probably caused the
reduction in the importance of these two variables. If
snowmelt events had been excluded, one of these variables
might have been found to be more important to runoff.

Both the variables measuring distance were deemed
unimportant by the rotation. The meander length of the
main channel, L, is usually not important for "small"
watersheds, because overland flow rather than channel
flow is more important to peak discharge rates. The
correlation of 0.80 of this variable with the fifth factor
in Table VII was relatively large for rejection, but the
small contribution of factor 5 to the variance makes this
correlation less important than an equal correlation in one
of the first four factors. The variable is obviously
important to the information, but the quest for as few variables
as possible for the prediction equation led to its
exclusion in the regression. The overland distance, GNDL,
was not nearly as important as the main channel distance,
which is contrary to the previous definition of "small"
watersheds. However, as with the overland slope, this
may have been difficult to measure accurately, giving a
possible reason for the apparent unimportance of GNDL.

VARIABLE INTERCORRELATIONS

The factors of Table VII indicate that certain names
may be applied to some of the factors. Also, because the
factors  are  uncorrelated,  the  important  variables  of  one 
factor  are  relatively  uncorrelated  with  the  variables  of 
another  factor.  The  development  also  shows  that  the  vari¬ 
ables  within  a  factor  are  usually  related.  All  of  these 
facts  may  be  used  to  provide  information  about  the  inter¬ 
correlations  of  the  variables. 

The  first  factor  was  highly  associated  with  storm 
variables  and  snowmelt  variables,  and  might  be  called  a 
"snowmelt"  factor.  As  shown,  the  duration  of  the  storm 
is  correlated  only  with  this  factor,  justifying  the  earlier 
statement  that  the  inclusion  of  snowmelt  events  affected 
the duration variable. Factors 6, 7, 8, 9, and 10 could
be assigned names such as "wind," "total precipitation,"
"deviate air temperature," "storm time distribution," and
"wind direction," even though the variance of each factor
was small.

Factors  2,  3,  4,  and  5  are  not  as  easily  named  as  the 
rest.  They  are  all  highly  associated  with  physiographic 
variables  only,  indicating  that  they  explain  the  water¬ 
shed  characteristics,  and  that  the  characteristics  are  un¬ 
related  to  the  storm  factors  because  they  are  in  different 
factors. This simply means that knowledge of the
characteristics of a watershed usually gives no insight into the
storm characteristics.

In  the  second  factor,  the  soil  variables  are  preva¬ 
lent.  The  soil  moisture,  infiltration  rate,  and  ground 
slope  could  very  well  be  related.  The  first  has  a  defi¬ 
nite  effect  on  infiltration  rate  (Chow,  1964),  and  the 
ground  slope  would  also  reflect  the  amount  of  moisture 
retained  by  the  soil.  The  additional  presence  of  azimuth 
and  elevation  in  this  factor  led  the  author  to  an  inves¬ 
tigation  of  the  soil  types  as  the  elevation  changed  from 
watershed  to  watershed.  Although  the  soil  types  were 
not  remarkably  different,  the  infiltration  rate  decreased 
as  the  elevation  increased,  indicating  that  these  vari¬ 
ables  may  be  related.  The  correlation  coefficient  for 
the  actual  measurements  of  these  variables  was  0.66,  and 
this  was  among  the  largest  values  for  all  the  correlations. 
Also,  the  azimuth  variable  was  highly  correlated  with  both 
the  elevation  and  infiltration  variables.  It  seems  reason¬ 
able  to  state  that  a  North  or  West-sloping  watershed  might 
contain  different  soils  and  infiltration  rates  than  a 
South or East-sloping basin. Prevailing winds, glacier
movements, or other factors could easily cause these
differences.

Factor 3 indicates that the overland distance, stream
frequency, watershed shape, and land use are related.
Because the stream frequency varies approximately as the
square of the "drainage density" (Chow, 1964), which
is a measure of the closeness of channel segments, and
because the overland flow distance is inversely related
to drainage density (Chow, 1964), the correlation
of GNDL and FREQ in this factor is justified by the
literature. Also, the shape of Montana watersheds may
in fact be related to the overland slope and stream
frequency, although no statements to this effect could be
found in the literature. The presence of USE in this
factor seems to have no reasonable explanation.
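
The link between overland distance and stream frequency through drainage density can be written out in one line. In the sketch below, $F$ is the stream frequency, $D_d$ the drainage density, and $L_g$ the overland flow distance; the proportionality constants $c_1$ and $c_2$ are placeholders, since the constants of the relationships cited from Chow (1964) are not given here.

```latex
F \approx c_1 D_d^{2}, \qquad L_g \approx \frac{c_2}{D_d}
\quad\Longrightarrow\quad
L_g \approx c_2 \sqrt{\frac{c_1}{F}}, \qquad
\log L_g \approx \log\!\left(c_2 \sqrt{c_1}\right) - \tfrac{1}{2}\,\log F .
```

Under these two relationships the logarithms of GNDL and FREQ would be expected to fall near a straight line with a negative slope, which is the kind of association a shared factor reflects.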

The  fourth  factor  indicates  that  the  main  channel 
slope  and  the  pondage  variables  are  related.  Although 
the  correlation  coefficient  of  these  variables  in  the 
actual  measurement  form  was  small,  a  coefficient  of  -0.85 
was  obtained  for  the  logarithmic  forms,  indicating  a  rela¬ 
tively  linear  plot  on  logarithmic  paper.  The  negative 
correlation  indicates  that  more  ponds  and  reservoirs  were 
found on watersheds having milder slopes. This could be
attributed to the fact that the "steepest" watershed,
Hump Creek, had no visible reservoirs on the maps.
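
A small correlation between two measurements together with a large correlation between their logarithms is what a power-law relation would produce. The sketch below demonstrates the effect with synthetic numbers that follow an exact inverse-square law; they are not the study's measurements of S and POND, and the particular law is an assumption chosen only to make the contrast visible.

```python
import numpy as np

# Synthetic data following an exact power law y = 100 / x**2 (not data from the study).
x = np.array([1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0])
y = 100.0 / x**2

r_raw = np.corrcoef(x, y)[0, 1]                       # correlation of the measurements
r_log = np.corrcoef(np.log10(x), np.log10(y))[0, 1]   # correlation of their logarithms
```

Here the logarithmic correlation is essentially -1 while the correlation of the untransformed values is much weaker, because a straight line describes the curved relation poorly.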

The close relationship of area and main channel
length is clearly indicated by the fifth factor, where
these are the only two variables highly correlated.
This relationship has been known to exist for several
years (Chow, 1964), and the rotation simply adds
justification to the relationship.

The  remaining  factors  indicate  that  the  average 
wind  velocity  is  related  to  the  standard  deviation  of 
wind  velocities,  which  is  not  difficult  to  accept.  The 
relationship  between  total  precipitation  and  antecedent 
precipitation  in  the  seventh  factor  is  interesting.  A 
positive  correlation  for  these  variables  indicates  that 
a "large" storm was typically preceded by a "large"
amount  of  precipitation  in  the  14  days  prior  to  the  be¬ 
ginning  of  the  storm.  The  actual  beginning  of  the  runoff- 
producing  storm  was  sometimes  difficult  to  determine,  and 
this  may  be  the  reason  for  the  relationship  between  these 
variables. A runoff-producing storm was frequently
observed a day or two after another smaller rainfall which
apparently  produced  no  runoff.  The  time  of  concentration, 
or  the  time  for  rainfall  at  the  farthest  reach  of  the 
watershed  to  arrive  at  the  runoff  gaging  station,  assum¬ 
ing  a  uniform  intensity  over  the  watershed,  was  estimated 
to  be  less  than  one  or  two  days  even  on  the  largest  water¬ 
shed.  The  storms  which  had  no  antecedent  help  usually  pro¬ 
duced  runoff  in  only  a  few  hours,  and  this  criterion  was 
used  in  selecting  the  beginning  point  for  the  precipitation 
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which  caused  the  runoff. 

As shown, the methods employed seem to substantiate
known relationships of certain variables. Some new
relationships for the locale of the watersheds were
discovered, and considerable insight into the importance of
some of the variables has been obtained. The "simple
structure" of the rotated factors in Table VII provided
the information discussed, and the agreements with the
literature are positive indications of the adequacy of
the methods.


REGRESSION  EQUATIONS 

The  coefficients  for  the  variables  in  the  regression 
equations  for  total  runoff  and  peak  discharge  rate  were 
listed  in  Table  XI  for  the  four  chosen  combinations  of 
variables. The dimensional "imbalance" of the equations
is  accounted  for  by  the  constant  for  each  equation.  If 
units  other  than  those  in  the  derivation  are  used,  then 
additional  conversion  constants  must  be  added. 
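
The role of the constant and of an added conversion constant can be sketched as follows, assuming the regression equations have the power-law form Y = C * x1^b1 * x2^b2 * ... implied by the references to exponents. The function name, the constant, and the exponent values are hypothetical and are not taken from Table XI.

```python
import math


def predict_power_law(constant, exponents, values):
    """Evaluate Y = constant * product(x_k ** b_k) through the log form."""
    log_y = math.log10(constant)
    for name, b in exponents.items():
        x = values[name]
        if x <= 0:
            raise ValueError(f"{name} must be positive for a logarithmic form")
        log_y += b * math.log10(x)
    return 10 ** log_y


# Hypothetical constant and exponents (not from Table XI).
exponents = {"A": 0.6, "I": 0.9}
constant_sq_mi = 50.0  # valid when A is in sq mi

q_sq_mi = predict_power_law(constant_sq_mi, exponents, {"A": 4.0, "I": 1.0})

# Same watershed with A in acres (1 sq mi = 640 acres): the constant must
# absorb the conversion factor raised to the exponent on A.
constant_acres = constant_sq_mi * (1.0 / 640.0) ** exponents["A"]
q_acres = predict_power_law(constant_acres, exponents, {"A": 2560.0, "I": 1.0})
print(q_sq_mi, q_acres)
```

Both calls describe the same watershed, so the two predictions agree only because the constant was adjusted for the change of units.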

Peak  Discharge  Rate 

In  the  first  equation  for  peak  discharge  rate,  the 
coefficients  on  infiltration  rate,  both  slopes,  soil 
moisture,  snow-water  equivalent,  air  temperature,  degree 
days, and stream frequency were negative, indicating that
the  peak  discharge  rate  increases  as  these  variables  de¬ 
crease.  It  would  seem  that  a  larger  slope  should  produce 
larger  peak  discharge  rates,  because  the  velocity  of  over¬ 
land  and  channel  flow  is  directly  related  to  slope.  Because 
the  peak  discharge  rates  were  measured  in  the  channel,  the 
size,  shape,  and  roughness  of  the  channel  are  also  deter¬ 
minants,  and  the  slope  cannot  be  considered  alone.  Also, 
soil  moisture  and  stream  frequency  are  variables  which 
should vary directly with peak discharge rate. Negative
coefficients on "obvious" variables such as area, intensity,
or  total  precipitation  were  not  present,  and  this  indicates 
that  "reasonable"  coefficients  were  obtained  for  most  of 
the  variables.  The  positive  coefficient  for  pondage  is 
questionable,  because  reservoirs  are  generally  installed  to 
reduce  peak  discharge  rates.  If  measurements  of  all  the 
variables  in  this  equation  are  available,  then  it  should 
give  the  best  prediction  for  peak  discharge  rate  because  it 
explains  93.4  per  cent  of  the  variation  in  the  original 
data,  regardless  of  the  respective  signs  of  the  exponents. 

The  three  factor-regression  equations  for  peak  dis¬ 
charge  rate  each  had  some  intuitively  incorrect  signs 
for  the  exponents.  All  three  indicated  that  total  pre¬ 
cipitation  is  inversely  related  to  peak  discharge  rates, 
and only two indicated a direct relation to intensity.
This would mean that short-duration, high-intensity storms
produced the greatest discharge rates, confirming the
statement that these were "small" watersheds. The negative
coefficient  for  total  precipitation  seems  intuitively  in¬ 
correct,  but  the  definition  of  small  watersheds  allows  an 
inverse  relationship  of  peak  discharge  rate  with  duration, 
and  probably  with  total  precipitation  because  this  generally 
increases  with  duration. 

As  in  the  principal  component  regression  equation, 
the  pondage  variable  for  the  last  three  equations  was 
directly  related  to  peak  discharge  rate.  The  reason  for 
this  apparent  error  is  unexplained,  and  deserves  further 
investigation  if  the  signs  of  the  coefficients  are  accepted 
as  being  correct. 

The  area  was  found  to  be  directly  related  to  peak  dis¬ 
charge  rate  in  all  the  equations,  agreeing  with  the  Ration¬ 
al  equation  for  runoff.  However,  the  ground  slope  coef¬ 
ficient  was  positive  only  in  the  last  equation,  and  the 
lack  of  information  of  the  channel  shape  and  roughness 
might  have  been  at  fault  here.  Also,  this  may  have  been 
caused  by  the  fact  that  few  runoff  events  with  large  peak 
discharge  rates  were  included  for  the  "steepest"  water¬ 
shed,  Hump  Creek. 
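
For reference, the Rational equation is commonly written Q = C i A. A minimal sketch is given below; the runoff coefficient and storm values are hypothetical, and the function name is an assumption made for illustration.

```python
def rational_peak_discharge(runoff_coefficient, intensity_in_per_hr, area_acres):
    """Rational equation Q = C * i * A.

    With i in in./hr and A in acres, Q is approximately in cfs,
    because 1 acre-in./hr is about 1.008 cfs.
    """
    return runoff_coefficient * intensity_in_per_hr * area_acres


# Hypothetical values: C = 0.3, i = 2.0 in./hr, A = 640 acres (1 sq mi).
q_one_sq_mi = rational_peak_discharge(0.3, 2.0, 640.0)
q_two_sq_mi = rational_peak_discharge(0.3, 2.0, 1280.0)
print(q_one_sq_mi, q_two_sq_mi)
```

In this form the peak discharge rate is directly proportional to area, which is the relationship the positive area coefficients agree with.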
Other coefficients for variables in the equations
for  peak  discharge  rate  could  not  be  satisfactorily  evalu¬ 
ated,  because  too  little  was  known  about  their  physical 
influence  on  discharge.  The  peculiarities  of  the  re¬ 
search  watersheds  and  measured  storms  may  have  resulted 
in  some  of  the  incorrect  signs.  The  equations  are  not  as 
sensible  as  expected,  but  the  theory  states  that  because 
they  were  obtained  from  uncorrelated  variates,  they  at 
least  satisfy  regression  requirements. 

Total  Runoff 

The  equations  for  total  runoff  also  exhibited  in¬ 
tuitively  incorrect  signs  for  a  few  coefficients.  The  most 
obvious  was  the  inverse  relationship  of  total  runoff  and 
watershed  area  in  all  the  equations.  This  indicates  that 
larger  runoff  volumes  were  produced  from  smaller  water¬ 
sheds,  and  may  be  due  to  the  fact  that  the  largest  run¬ 
off  volume  occurred  from  a  storm  on  Hump  Creek.  This 
storm  had  a  particularly  long  duration  with  a  small  peak 
discharge  rate  and  a  large  total  volume  of  runoff.  The 
possible  influence  of  one  storm  on  the  final  equations  in¬ 
dicates  that  the  method  might  be  overly  sensitive  to  indi¬ 
vidual  storms,  but  this  statement  was  not  supported  by 
the  literature.  Further  studies  are  suggested  if  these 
methods  are  to  be  employed  in  future  regression  analyses. 
The ground slope was inversely related to runoff volume
for  the  first  three  equations,  and  this  might  be  viewed  as 
incorrect,  because  the  milder  slopes  should  retain  more 
precipitation,  thereby  reducing  the  total  runoff.  However, 
infiltration  rates  and  antecedent  soil-moisture  conditions 
must  also  be  considered.  The  slope  cannot  be  separated 
from  other  variables  in  determining  the  correct  relation¬ 
ship  with  runoff,  and  the  signs  obtained  may  be  correct. 

Another  possible  intuitive  error  in  the  equations  for 
runoff  is  the  signs  for  the  coefficients  of  intensity  of 
precipitation,  which  are  negative  for  all  equations.  High 
intensity  storms  should  reasonably  produce  large  runoff 
volumes;  however,  other  variables  such  as  duration  and 
the snowmelt variables must also be considered. The
low-intensity snow storms with long durations could have
preceded large runoff events, giving a possible explanation
for  the  inverse  relationship. 

The  total  precipitation  should  obviously  be  directly 
related  to  total  runoff,  and  this  was  found  to  be  true  in 
the  first  three  equations.  The  negative  coefficient  in 
the  fourth  equation  should  definitely  be  questioned  for 
this  variable.  Large  snowmelt-runoff  volumes  for  rela¬ 
tively  small  amounts  of  total  precipitation  could  have 
created  this  error,  but  this  should  have  caused  negative 
signs in all the equations and may not be a reasonable
explanation. 

In summary, it would seem that the equations for
peak discharge rate and total runoff were theoretically
accurate,  but  the  sensibility  of  the  coefficients  was  not 
as  consistent  as  the  literature  had  predicted.  Most  of  the 
incorrect  signs  could  be  justified  by  stating  that  other 
variables  or  watershed  and  storm  peculiarities  were  respon¬ 
sible,  but  some  coefficients  could  not  reasonably  be  placed 
in  this  category.  The  suggestions  for  the  limitations  of 
these  equations  in  the  next  section  should  provide  caution 
in  any  predictions. 

LIMITATIONS  OF  RESULTS 

Statistical  conclusions  from  four  years  of  informa¬ 
tion  on  five  watersheds  may  not  be  reliable.  In  general, 
hydrologists  prefer  a  much  longer  data  period.  However, 
no  frequency  studies  were  made  in  this  report,  and  the  au¬ 
thor  felt  that  the  size-range  of  storms  and  floods  was  well 
represented  by  the  data.  Discharge  rates  ranging  from  10 
to  1720  cfs,  and  runoff  volumes  between  0.006  and  1.806 
inches  were  present.  Because  no  frequency  analyses  were 
performed,  and  because  the  mechanics  of  the  variables  was 
the  prime  objective  of  this  study,  sufficient  data  was 
felt  to  be  present.  The  measurements  of  the  variables 
were made during each runoff event, and measurements of
the  mechanics  of  the  variables  were  therefore  taken.  Had 
only  a  single  runoff  event  been  observed,  then  the  rotation 
method  should  still  have  indicated  the  important  variables 
for  the  storm,  and  the  regression  analysis  should  have 
yielded  an  equation  which  predicts  that  particular  storm. 
The  use  of  50  runoff  events  simply  extends  the  information 
about  the  mechanics  of  the  variables.  Recognition  of  the 
mechanics  from  the  data  is  attempted  in  this  report,  but  is 
not  yet  complete. 

The regression equations are probably not as reliable
as the information of the variables. Long term variables
such as annual snow fall, solar radiation, prevailing
winds, or watershed "treatments" such as conservation
practices, crop rotation, etc., could not be
measured  in  the  data  period.  Prediction  with  the  equa¬ 
tions  should  probably  be  confined  to  storms  occurring 
within  a  time  period  which  is  not  affected  by  the  long 
term  variables.  This  period  remains  to  be  determined. 

With  the  above  limitations,  the  equations  should 
be  suitable  for  use  on  ungaged  watersheds  in  the  locale 
of  the  research  watersheds.  Measurements  or  estimations 
of  the  independent  variables  must  be  made,  and  some  of 
these  may  be  difficult  to  obtain,  particularly  in  the 
longer equations. The variables in the last equation can
be  obtained  from  contoured  topographic  maps  and  from  esti¬ 
mations  of  the  storm  intensity  and  volume.  Design  storms 
will  require  estimates  of  the  T-year  intensities  and 
volumes,  which  would  hopefully  give  the  T-year  peak  dis¬ 
charge  rates  and  runoff  volumes.  This  is  conjecture  at 
this  point,  and  frequency  analyses  of  the  data  would  have 
to  be  made  before  this  relationship  could  be  used. 

Use of values of the independent variables "outside"
the  range  of  each  variable  is  usually  discouraged  in  lin¬ 
ear  regression  equations.  The  reasoning  here  is  that  the 
equation  predicts  a  line,  plane,  or  hyperplane  whose  bound¬ 
aries  are  the  limits  of  the  measured  independent  variables. 
For  example,  a  simple  linear  regression  for  one  variable 
in  terms  of  another  would  yield  a  straight  line  through 
the  data  points  on  a  plot  of  the  two  variables.  If  only 
a  small  range  of  values  for  the  variables  was  used,  the 
line  may  be  only  a  segment  of  a  larger  line,  which  may  be 
curvilinear.  Prediction  outside  the  range  of  the  variables 
would  yield  a  point  on  the  extended  line,  but  this  line  may 
not  represent  the  total  curve.  The  actual  and  transformed 
variable  means  and  standard  deviations  listed  in  Appendix 
H may be used as a guide in determining the range for each
variable.
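
A check of this kind can be sketched as below. The sketch flags any input that falls outside the range observed in the calibration data; the variable names and values are hypothetical, and the use of the observed minimum and maximum (rather than the means and standard deviations of Appendix H) is an assumption made for illustration.

```python
def outside_range(values, calibration):
    """Return the inputs that fall outside the observed calibration range.

    values: dict of variable name -> value proposed for prediction
    calibration: dict of variable name -> list of measured values
    """
    flagged = {}
    for name, x in values.items():
        low, high = min(calibration[name]), max(calibration[name])
        if not low <= x <= high:
            flagged[name] = (x, low, high)
    return flagged


# Hypothetical calibration measurements (not thesis data).
calibration = {
    "A": [3.2, 5.1, 12.4, 20.8, 54.0],     # sq mi
    "TPCP": [0.25, 0.60, 1.10, 1.95, 2.40]  # in.
}
proposed = {"A": 75.0, "TPCP": 1.5}

flagged = outside_range(proposed, calibration)
print(flagged)
```

Any flagged variable indicates a prediction made on the extended line rather than on the segment supported by the data.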




Chapter  VI 

CONCLUSIONS  AND  RECOMMENDATIONS 

CONCLUSIONS 

Two  objectives  for  this  investigation  were  presented 
in  Chapter  I.  The  first  was  the  determination  of  which 
of  the  29  independent  variables  were  more  important  to 
the  dependent  variables:  peak  discharge  rate  and  runoff 
volume  from  small,  central  and  eastern  Montana  watersheds. 
The  second  objective  was  the  derivation  of  regression  equa¬ 
tions  for  the  dependent  variables  in  terms  of  the  important 
independent  variables. 

It  is  believed  that  the  analyses  adequately  fulfilled 
the  first  objective.  The  variables  which  contributed  most 
to  the  peak  discharge  rates  and  runoff  volumes  were  in 
general  agreement  with  the  most  important  variables  listed 
in  the  literature.  The  following  variables:  precipitation 
intensity,  standard  deviation  of  precipitation  intensities, 
soil  and  air  temperature,  watershed  azimuth,  overland  slope, 
watershed  shape,  reservoir  area,  and  watershed  area  were 
among  the  most  successively  important  independent  variables. 
The  variables  related  to  snowmelt  runoff  events  were  also 
important,  indicating  that  some  variables  were  important 
or  unimportant  because  snowmelt  runoff  events  were  included 
in  the  investigation. 




The  principal-component  and  rotated-factor  regression 
equations  for  the  peak  discharge  rate  and  runoff  volume 
from the watersheds were not as consistent as expected.
The equations were derived from independent components and
factors,  and  therefore  satisfy  multiple  regression  as¬ 
sumptions.  The  equations  for  the  dependent  variables  in 
terms  of  29  independent  variables  exhibited  the  most  in¬ 
tuitively  correct  relationships  among  the  variables.  When 
variables  were  discarded  for  the  rotated-factor  regression 
equations,  some  relationships  of  dependent  and  independent 
variables  became  intuitively  incorrect,  especially  when 
only  six  independent  variables  were  used. 

RECOMMENDATIONS FOR FUTURE RESEARCH

Because  Wallis  (1968)  published  his  paper  after  the 
present  investigation,  some  of  his  suggestions  were  not 
included  herein,  and  are  presented  as  recommendations  for 
continued  research.  The  following  analyses  are  suggested 
to  comply  with  his  recommendations:  From  the  rotated 
factors  of  Table  VII,  select  only  two  variables  from  the 
first  and  second  factors,  and  one  variable  from  the  third, 
fourth, fifth, and seventh factors, yielding eight
independent variables, I, AIRT, AZ, GNDS, SHP, POND, A, and
TPCP, respectively. Perform a principal component analysis
of the correlations of only these variables, compute
the principal factors, and rotate the factors. If any of
the  variables  are  unimportant  to  all  the  rotated  factors, 
repeat  the  analysis  for  the  important  variables.  If  all 
the  factors  are  important,  compute  the  principal  component 
regression  equation  for  these  variables.  This  analysis  may 
or  may  not  yield  a  better  prediction  equation  than  any  of 
those  included,  but  the  results  should  prove  interesting. 

Because  the  inclusion  of  snowmelt  runoff  events  in  the 
present  investigation  caused  some  of  the  incorrect  signs 
in  the  regression  equations,  the  separation  of  snowmelt  and 
rainfall-produced  runoff  events  is  recommended  in  future 
analyses.  The  methods  herein  are  suggested  for  use  in 
determining  important  variables  to  each  type  of  runoff 
event. 

The  information  about  the  important  variables  and  re¬ 
lated  variables  could  be  used  in  any  future  study.  Al¬ 
though  the  termination  of  the  measurements  of  the  unimpor¬ 
tant  variables  is  not  recommended,  the  author  does  suggest 
that  concern  for  precise  measurements  of  many  variables 
may  be  "relaxed"  in  favor  of  better  measurements  of  the 
important  variables.  The  amount  of  data  used  and  the  use 
of  this  single  analysis  indicate  that  absolute  determina¬ 
tions  of  totally  unimportant  variables  are  not  contained 
herein.  This  report  should  serve  as  a  guide  for  any 
future analyses which question the relative importance of
certain  variables,  the  possible  relationships  of  certain 
variables  with  each  other,  or  the  prediction  equations  for 
the  dependent  variables. 

SUMMARY 

The  analysis  of  peak  discharge  rates  and  runoff  volumes 
from  five  central  and  eastern  Montana  watersheds  for  the 
important  causative  factors  and  the  principal  component 
and  factor  regression  equations  is  contained  in  this  re¬ 
port.  Prediction  of  peak  discharge  rates  and  runoff  volumes 
for  ungaged  watersheds  in  the  locale  of  the  research  water¬ 
sheds  is  recommended  with  the  developed  equations  if  certain 
limitations  are  recognized.  The  results  obtained  and  the 
agreements  with  the  literature  suggest  that  multivariate 
statistical  methods  are  suitable  for  use  in  "multi-vari¬ 
able"  hydrologic  studies  for  obtaining  information  about 
the  relative  importance  of  each  of  the  variables,  infor¬ 
mation  about  the  intercorrelations  of  the  variables,  and 
satisfactory  regression  equations  for  the  runoff  variables. 
APPENDIX A


Graphical Derivation of Principal Component Theory

The procedure for finding the $l_{kj}$ in Equation (4)
can be viewed graphically to give a physical concept
of variance. From Equation (4), the $j$th component is
seen to be linearly related to the standardized variables.
If the variables are considered as axes in a p-dimensional
space, then any principal component is simply a line through
this set of axes. One equation of a line in p dimensions
is (Kendall, 1957)

$$\frac{x_1 - m_1}{l_1} = \frac{x_2 - m_2}{l_2} = \cdots = \frac{x_p - m_p}{l_p} \qquad (A1)$$

where the $m_k$ are intercepts with the $x_k$ axes, and the $l_k$
are the direction cosines. A two-dimensional example of
this form is shown in Figure A1, where $l_1$ and $l_2$ are the
cosines of angles $\theta_1$ and $\theta_2$. The notation for direction
cosines, $l$, and for the coefficients in Equation (4) is
the same because these values are later shown to be the
same.

The measurements of the variables, after being
standardized, could be plotted as points in the p-dimensional
space, just as points $P_i$ are shown in Figure A1. Finding
the component which reproduces a maximum amount of the
information is identical to drawing a line through the
points in a manner which minimizes the perpendicular
distances of the points to the line. The first impulse is
to write an equation for the sum of the distances and
then minimize this sum. However, some of the distances
are negative, and cancellation effects are encountered.
To rectify this, the distances can be squared, eliminating
the negative signs, and the sum of squared distances can
be minimized, which is simply least-squares theory.


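Mathematics textbooks give the equation for the
distance from point $P_i$ to a line in two dimensions as

$$d_i = \frac{A x_{1i} + B x_{2i} - C}{\sqrt{A^2 + B^2}}$$

if the line has the equation

$$A x_1 + B x_2 = C \qquad (A2)$$

The equation chosen for the line in Figure A1 reduces to

$$l_2 x_1 - l_1 x_2 = l_2 m_1 - l_1 m_2$$

which has the form of Equation (A2). Substituting for A,
B, and C, and squaring gives

$$d_i^2 = \left[ l_2 (x_{1i} - m_1) - l_1 (x_{2i} - m_2) \right]^2$$

since $l_1^2 + l_2^2 = 1$ for the direction cosines of a line in
two dimensions.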

Since p dimensions instead of two are used in this study,
then the squared distance from the $i$th point to the line
of Equation (A1) is

$$d_i^2 = \sum_{k=1}^{p} (x_{ki} - m_k)^2 - \left[ \sum_{k=1}^{p} l_k (x_{ki} - m_k) \right]^2$$

and the sum of all squared distances for the n observations
is

$$\sum_{i=1}^{n} d_i^2 = nS = \sum_{i=1}^{n} \left\{ \sum_{k=1}^{p} (x_{ki} - m_k)^2 - \left[ \sum_{k=1}^{p} l_k (x_{ki} - m_k) \right]^2 \right\} \qquad (A3)$$

which is the form given by Kendall (1957, p. 14). It is
desired to find the $m_k$ and $l_k$ which minimize this sum. Upon
equating the partial differentials of this equation with
respect to the $m_k$ to zero, Kendall shows that the line
passes through the origin of the standardized variables,
and the $m_k$ are all zero. Before partial differentiation
of Equation (A3) with respect to $l_k$, the condition that

$$\sum_{k=1}^{p} l_k^2 = 1 \qquad (A4)$$

must be included, since the squares of the direction
cosines of any line must sum to unity. Although Kendall
does not show it, he adds the value

$$nL \left( \sum_{k=1}^{p} l_k^2 - 1 \right) = 0 \qquad (A5)$$

to Equation (A3) before differentiating. This is done
to assure the investigator that Equation (A4) will apply
when the values of $l$ are found. The value L (the
eigenvalue) is a constant that may be zero, but it must be
considered unless it is found later to be zero. Dropping
$m_k$ in Equation (A3), adding Equation (A5), and
differentiating with respect to $l_j$ gives the set of equations

$$\frac{1}{n} \sum_{k=1}^{p} \sum_{i=1}^{n} l_k \, x_{ki} \, x_{ji} - L \, l_j = 0; \quad j = 1, 2, \ldots, p \qquad (A6)$$


Since the correlation between two standardized variables
is (Harman, 1967, p. 13)

$$r_{jk} = \frac{1}{n} \sum_{i=1}^{n} x_{ji} \, x_{ki} \qquad (A7)$$

then Equations (A6) can be written as

$$\sum_{k=1}^{p} l_k \, r_{jk} - L \, l_j = 0; \quad j = 1, 2, \ldots, p \qquad (A8)$$


Noting that the correlation of a variable with itself is
unity, Equations (A8) can be expanded to

$$
\begin{aligned}
l_1 (1 - L) + l_2 r_{12} + \cdots + l_p r_{1p} &= 0 \\
l_1 r_{21} + l_2 (1 - L) + \cdots + l_p r_{2p} &= 0 \\
&\;\;\vdots \\
l_1 r_{p1} + l_2 r_{p2} + \cdots + l_p (1 - L) &= 0
\end{aligned}
$$

where the $l_k$ and L are to be determined. These equations
can be written in matrix form as

$$|R - LI| \, |l| = 0$$

where $|l|$ is a vector of direction cosines. If all
direction cosines were zero, then all the angles between
the desired line and the $x_k$ axes would be right angles.
This means that the line would be perpendicular to all
the axes. Since this is not likely, the $l$-vector can be
eliminated giving the matrix of Equation (10) in Chapter
III.
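
The derivation above can be checked numerically with a minimal sketch. It builds the correlations of Equation (A7) from standardized variables, finds the direction cosines of the first component by power iteration, and confirms that they satisfy Equations (A4) and (A8). Power iteration is a substitute technique chosen for brevity and is not the computational method of this study; the data values and function names are hypothetical. The sketch assumes no variable is constant and that the starting vector is not perpendicular to the first component.

```python
import math


def correlation_matrix(data):
    """data: list of p variables, each a list of n observations."""
    n = len(data[0])
    standardized = []
    for column in data:
        mean = sum(column) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in column) / n)
        standardized.append([(x - mean) / sd for x in column])
    p = len(standardized)
    # Equation (A7): r_jk = (1/n) * sum_i x_ji * x_ki
    return [[sum(a * b for a, b in zip(standardized[j], standardized[k])) / n
             for k in range(p)] for j in range(p)]


def first_component(r, iterations=500):
    """Largest eigenvalue L and unit-length direction cosines l of r."""
    p = len(r)
    l = [1.0 / (k + 1) for k in range(p)]  # arbitrary starting vector
    for _ in range(iterations):
        w = [sum(r[j][k] * l[k] for k in range(p)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        l = [x / norm for x in w]          # enforces Equation (A4)
    eigenvalue = sum(l[j] * sum(r[j][k] * l[k] for k in range(p))
                     for j in range(p))
    return eigenvalue, l


# Hypothetical measurements of three variables on six events.
data = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
    [6.0, 5.0, 3.0, 4.0, 1.0, 2.0],
]
r = correlation_matrix(data)
eigenvalue, cosines = first_component(r)
print(eigenvalue, cosines)
```

The largest eigenvalue is the one of interest because minimizing the sum of squared perpendicular distances is equivalent to maximizing the squared projections onto the line.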
APPENDIX B

Graphical Derivation of Varimax Rotation Theory

Graphically, the principal components are reference
axes for the independent variables. If two components
and two independent variables were present, then the
components could be represented by the axes $V_1$ and $V_2$
in Figure B1.

Figure B1. Graphical Representation of Principal Components

Since the components are linear equations
involving the standardized independent variables, as
shown in Equation (4), then the variables can also be
written in terms of the s important components (Kendall,
p. 16). The variable equations are

$$x_{ki} = \sum_{j=1}^{s} l_{kj} \, V_{ji} \qquad (B1)$$

where the terms are the same as for Equation (4). If
the $V_j$ are considered to be directional unit vectors, then
the $l_{kj}$ are projections of the "variable" vectors, $x_k$,
onto the $V_j$ axes, and Equation (B1) defines the position
vector for variable $x_k$ in the reference frame of Figure B1.

The purpose of rotation is to rotate the axes $V_1$
and $V_2$ through an angle $\theta$ to new positions
such that the projections onto the new axes are as
large as possible. This means that the new components
will have either large or small coefficients on the
variables, and better interpretations of the importance of
the variables will be possible. Coefficients similar to
those of the original components (i.e., coefficients that
are large for all the variables) will still be present,
but if large coefficients are maximized, some small
coefficients will be present in the rotated components.

Kaiser obtained his best interpretations from "normal
varimax rotation" (1959). In order to use his methods,
the components must first be converted to factors by
multiplying the coefficients of each component by the
component's eigenvalue, as shown by Equation (12), or


A. 

0 


which  gives  the 


factor 


the  equation 


K 


(12) 


if Equation (4) is substituted for $V_j$. (Baggaley (1964,
p. 256) was concerned with the effect of this modification
of the components on the interpretation of the variables,
and found that nothing was changed. The variance of a
factor is the same as the variance of a component (Harman,
p. 16), and the contribution of both to the total variance
is the same.)

After  the  factors  are  found,  the  variable  vectors 
are  given  unit  length  by  dividing  each  coefficient  by 
the  length  of  the  vector.  The  length  is  simply  the  square 
root of the sum of squared coefficients. The new
"normalized" variable vectors are therefore defined by


V. 

J 


(B2)


for  the  s  components ,  where 
(B3)


and 


\  = 


E 

\j  j=i 


W 


This procedure results in variable vectors having
unit lengths and plotted in a reference frame with
factors for axes, as shown in Figure B2.

Figure B2. Normalized Variable Vectors Plotted in a Factor Reference Frame

The desired angle of rotation is still $\theta$, and the
projections to be maximized are now the $b_{kj}$, the
projections on the rotated factors. As with the principal
component derivation in Figure A1, some of the projections
may be negative, and squaring eliminates the minus signs.
However, the sum of these squared projections is not the
variance of a factor, which is to be maximized. Graphically,
the variance of a factor is found by subtracting the average
squared projection on the $j$th factor from each squared
projection, and then squaring this difference to eliminate
minus signs. The average squared projection on the $j$th
factor is given by

$$\overline{b_j^2} = \frac{1}{p} \sum_{k=1}^{p} b_{kj}^2$$

if there are p variable vectors. Subtracting this from
each squared projection on the $j$th factor, squaring the
difference, and summing for all factors gives the sum

$$W = \sum_{j=1}^{s} \sum_{k=1}^{p} \left( b_{kj}^2 - \overline{b_j^2} \right)^2 \qquad (B4)$$

This is the sum of all the squared differences for the
p variables and s factors, and it represents the sum of
squared projections on the rotated factors. If this sum
is divided by p, then an "average" squared projection on
the factors is approximated. Kaiser defines the variance
of a factor as this "average", and his equation to be
maximized is given (1959) by Equation (15) in Chapter III.
Equation (15) is obtained by dividing Equation (B4) by
p, and expanding the squared term.

After the $b_{kj}$ which maximize Equation (15) are found,
the coefficients of each final variable vector are
multiplied by the lengths of the original vectors to place
them back in the same relative perspective before they
were normalized, rather than allowing each to have unit
length (Kaiser, 1959). The rotated factors therefore
have the final equations, with the $j$th rotated factor for
the $i$th observation given by

$$\sum_{k=1}^{p} c_{kj} \, x_{ki} \qquad (B5)$$

where

$$c_{kj} = h_k \, b_{kj}$$

and the $c_{kj}$ provide the information for interpretation
of the importance of the variables to the rotated factors,
and therefore to the information present in the data. The
values of Equation (22) are the same as these $c_{kj}$.
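
The steps of this appendix can be sketched for the two-factor case. The sketch normalizes each variable vector to unit length, searches for the rotation angle that maximizes the variance of the squared projections summed over the factors, and then restores the original lengths. The search is a brute-force scan over the angle, used here for clarity in place of Kaiser's analytic solution; the loadings and function names are hypothetical, and every variable vector is assumed to have nonzero length.

```python
import math


def varimax_criterion(loadings):
    """Variance of the squared loadings of each factor, summed over factors."""
    p = len(loadings)
    s = len(loadings[0])
    total = 0.0
    for j in range(s):
        squared = [row[j] ** 2 for row in loadings]
        mean_sq = sum(squared) / p
        total += sum((b - mean_sq) ** 2 for b in squared) / p
    return total


def rotate(loadings, theta):
    """Projections of each variable vector on axes rotated through theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[a1 * c + a2 * s, -a1 * s + a2 * c] for a1, a2 in loadings]


def normal_varimax_2d(loadings, steps=9000):
    # Normalize: give every variable vector unit length.
    lengths = [math.hypot(a1, a2) for a1, a2 in loadings]
    unit = [[a1 / h, a2 / h] for (a1, a2), h in zip(loadings, lengths)]
    # The criterion repeats every 90 degrees, so scan 0 <= theta < pi/2.
    angles = (k * (math.pi / 2) / steps for k in range(steps))
    best_theta = max(angles, key=lambda t: varimax_criterion(rotate(unit, t)))
    # Restore the original lengths: c_kj = h_k * b_kj.
    rotated_unit = rotate(unit, best_theta)
    rotated = [[h * b for b in row] for row, h in zip(rotated_unit, lengths)]
    return best_theta, rotated


# Hypothetical factor coefficients for four variables on two factors.
loadings = [[0.7, 0.5], [0.6, 0.6], [0.5, -0.6], [0.6, -0.5]]
theta, rotated = normal_varimax_2d(loadings)
print(math.degrees(theta))
for row in rotated:
    print([round(c, 3) for c in row])
```

Because the rotation is rigid, the length of each variable vector is unchanged; only the distribution of its coefficient between the two factors changes.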
APPENDIX C


Descriptions of Independent Variables

The variable numbers, names, and definitions listed
below describe the 29 independent variables of Table II
in Chapter IV. Each definition gives the units of the
variable in parentheses, the methods of obtaining each
variable, and the source of the data for each measurement
of each variable.


Phy s i o graphic  Variables 


1.  A 


Area  of  watershed  ( sq  mi)  from  SC3  soil  map 
(scale:  4  in.  =  1  mi). 


2 .  SHP  : 


Sha,pe  of  watershed  (dimensionless)  from  the 
ratio  of  the  length  of  the  longest  line  that 
could  be  passed  through  the  center-of-area  of 
the  watershed  to  the  length  of  a  perpendicular 
through  this  point.  Center-of-area  found  from 
the  balance  point  of  a  cut-out  of  the  water¬ 
shed  from  3CS  soil  map. 


3.  A Z  : 


Azimuth  of  main  channel  (deg)  measured  clock¬ 
wise  from  true  North  when  referred  to  the 
water-stage  recorder.  Mean  azimuth  of  straight- 
line  sections  of  main  channel  computed  from 


AZ  = 


E( AZi  Li) 


where 


AZ, 


=  azimuth  of  each  straight-line 
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segment,  starting  at  gaging  station 
and  ending  at  closest  point  on  basin 
divide  to  end  of  main  channel  on 

SCS  soil  map, 

and, 

=  length  of  each  straight  line  seg¬ 
ment. 

4.  E1SV: 

Elevation  of  watershed  (ft)  computed  from  an 
average  of  the  elevations  at  0.2  and  0.8  of 
the  main  channel  straight-line-segment  length 
from  the  gaging  station  to  the  nearest  point 
of  basin  divide  from  head  of  main  channel. 
Elevations  from  contoured  topographic  map  and 
distances  from  chartometer  of  SCS  soil  map. 

5 .  GND3 : 

Maximum  ground  slope  (ft/ft)  computed  by  divi¬ 
ding  the  distance  from  the  head  of  the  blue, 
main- channel  line  on  USGS  Quadrangle  map  to  the 
nearest  point  on  the  basin  divide,  by  the 
change  in  elevation  between  these  points. 

6.  GNDL: 

Overland  distance  (mi)  of  flow  obtained  from 
distance  from  head  of  blue,  main-channel  line 
on  USGS  Quadrangle  map  to  nearest  point  on 
basin  divide. 

7.  FREQ:  Stream frequency (1/sq mi), or number of stream
           segments per unit area. Obtained by dividing
           the total number of stream segments of all
           orders by the area of the watershed. Orders
           of stream segments determined by numbering
           heads of tributaries 1st order, and having
           two 1st order streams joining to a 2nd order,
           etc. Count taken from SCS soil map.

8.  S   :  Main channel slope (ft/ft) computed as the
           slope of a straight line drawn through the
           main channel straight-line-segment profile.
           The straight line is fitted to the profile
           in such a way as to give the same area under
           the line as the total area under the profile,
           using equal base lengths. Profile from contoured
           Quadrangle maps (for Duck and E. F. Duck Creek
           watersheds, profile from transit-stadia surveys).

9.  L   :  Main channel meander length (mi) from chartometer
           of SCS soil map main channel (largest order
           stream), including distance from end of blue
           line to nearest point on basin divide.

10.  USE :  Land use ratio (dimensionless) obtained from
            a ratio of the vegetated (including forests for
            Hump Creek) area to the bare-cultivated (summer
            fallow) area. Relative areas were determined
            subjectively from each section in the aerial
            photographs.


11.  INFR:  Infiltration rate (in./hr) of watershed, weighted
            for rates of different soil types. Computed
            from,

            $$\mathrm{INFR} = \frac{\sum (I_i \, A_i)}{\sum A_i}$$

            where,

            I_i = average measured infiltration rate on each
                  soil type of watershed. If not measured,
                  soil was classified by SCS method (type A,
                  B, C, or D), and the average for all
                  watersheds was applied to any soil types
                  which were not tested,

            and,

            A_i = area of each soil type on watershed from
                  planimeter of SCS soil map.


12.  POND:  Distance-weighted per cent of total area in
            ponds and reservoirs (%) computed from,

            $$\mathrm{POND} = \frac{\sum (L_i \, AP_i)}{A \sum L_i} \, (100)$$

            where,

            L_i  = straight-line-segment distance along main
                   channel from gaging station to each pond
                   on SCS soil map,

            AP_i = full surface area of each pond or
                   reservoir from planimeter of SCS soil map,

            and,

            A    = area of watershed.
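
The weighted means behind AZ and INFR can be illustrated with a minimal sketch. It assumes that the mean azimuth is weighted by segment length and that the infiltration rate is weighted by soil-type area. The function names and the sample numbers are hypothetical and are not taken from the watershed data.

```python
def weighted_mean(values, weights):
    """Return sum(value * weight) / sum(weight)."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def mean_azimuth(segment_azimuths_deg, segment_lengths):
    """AZ: azimuth of each straight-line segment weighted by its length.

    Assumption: plain arithmetic weighting, with no special handling
    of the wrap-around at 0/360 degrees.
    """
    return weighted_mean(segment_azimuths_deg, segment_lengths)


def watershed_infiltration(soil_rates_in_per_hr, soil_areas):
    """INFR: infiltration rate of each soil type weighted by its area."""
    return weighted_mean(soil_rates_in_per_hr, soil_areas)


if __name__ == "__main__":
    # Hypothetical values chosen only to keep the arithmetic easy to check.
    print(mean_azimuth([90.0, 180.0], [1.0, 3.0]))         # 157.5
    print(watershed_infiltration([0.5, 1.5], [2.0, 2.0]))  # 1.0
```

The same helper would serve any of the Thiessen-weighted variables in this appendix, with station values as `values` and Thiessen areas as `weights`.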


Storm Variables


13.  I   :  Intensity of precipitation (in./hr) computed
            as mean of all hourly intensities from all
            recording precipitation stations from beginning
            of storm on the first recording station to
            hour of last runoff from hydrograph. Beginning
            of storm determined from precipitation
            hyetographs. (Last major snowfall used for
            snowmelt runoff events.)

14.  ISD :  Standard deviation of precipitation intensities
            used in variable 13, computed from,

            $$(\mathrm{ISD})^2 = \frac{\sum I_i^2 - \frac{(\sum I_i)^2}{n}}{n-1}$$

            where,

            n = total number of hourly intensities
                recorded for all recording precipitation
                stations.

15.  D   :  Duration of storm (hr) computed as an average
            time to the center of area of rainfall
            hyetographs for all recording precipitation
            stations, computed from,

            $$D = \frac{\sum (DCA_i \, A_i)}{\sum A_i}$$

            and,

            $$DCA_i = \frac{\sum (T_i \, I_i)}{\sum I_i}$$

            where,

            DCA_i = time in hours from beginning of
                    each rainfall hyetograph to the
                    center of area of each hyetograph,
                    where hyetograph ends with last
                    precipitation before hour of end
                    of runoff,

            A_i   = Thiessen area of each recording
                    precipitation station from planimeter
                    of SCS soil map,

            T_i   = time in hours from beginning of
                    hyetograph to I_i, where,

            I_i   = hourly precipitation at the respective
                    recording station.


16.  TDF :  Time distribution factor (hr) of precipitation
            over watershed, computed as,

            $$\mathrm{TDF} = D1 - D$$

            where,

            $$D1 = \frac{\sum (DCA_j \, A_j)}{\sum A_j}$$

            and,

            DCA_j = time in hours from beginning of
                    first appearing rainfall hyetograph
                    to the center of area of each
                    hyetograph,

            and,

            A_j   = Thiessen area of each recording
                    precipitation station from planimeter
                    of SCS soil map.

17.  TPCP:  Total precipitation (in.) computed as an
            average Thiessen-weighted precipitation from
            all precipitation stations, or,

            $$\mathrm{TPCP} = \frac{\sum (A_i \, P_i)}{\sum A_i}$$

            where,

            P_i = total storm precipitation at each
                  recording and non-recording precipitation
                  station from beginning of storm to end
                  of runoff,

            and,

            A_i = Thiessen area of each station from
                  planimeter of SCS soil map.

18.  API :  Fourteen-day antecedent precipitation index
            (in.) computed from,

            $$API_i = k \, (API_{i-1}) + P_i$$

            where,

            API_i    = antecedent precipitation index for
                       the ith day after the fourteenth
                       day before the first day of the
                       storm,

            k        = 0.78, a reduction constant to
                       indicate the evaporation-transpiration
                       losses from day to day. (Usually
                       ranges from 0.80 to 0.95 (Chow,
                       p. 25-102), and small value used
                       here for relatively dry watersheds.),

            P_i      = Thiessen-weighted average daily
                       precipitation for each station,

            and,

            API_{14} = API for first day of storm.

19.  SOLM:  Average water content (%) of soil for all
            stations for 3, 9, and 18-inch depths, at
            beginning of first hyetograph.


20.  WDIR:  Predominant wind direction during storm period
            (dimensionless), obtained from weather station
            data, and given the numbers 1, 2, 3, ..., 8 to
            represent a wind from the N, NW, W, SW, S, SE,
            E, NE, respectively.

21.  WEEK:  Week of the year (dimensionless) in which the
            date of the peak discharge rate fell. Weeks
            were numbered from the first week in Jan.,
            regardless of the number of days in this week.


22.  AIRT:  Average Thiessen-weighted air temperature
            (deg F), 4 ft above ground, during storm period.
            Given by,

            $$\mathrm{AIRT} = \frac{\sum (T_i \, A_i)}{\sum A_i}$$

            where,

            T_i = mean hourly temperature at a station
                  during the time from beginning of first
                  precipitation to end of runoff hydrograph,

            and,

            A_i = Thiessen area of each weather station
                  on the watershed.


23.  ATSD:  Average Thiessen-weighted standard deviation
            of air temperatures (deg F), 4 ft above ground,
            during the storm period. Given by,

            $$\mathrm{ATSD} = \frac{\sum (ST_i \, A_i)}{\sum A_i}$$

            and,

            $$(ST_i)^2 = \frac{\sum T^2 - \frac{(\sum T)^2}{n}}{n-1}$$

            where,

            ST_i = standard deviation of hourly air
                   temperatures at a weather station
                   during the storm period,

            n    = number of hourly air temperatures
                   considered,

            T    = hourly air temperature at 4 ft
                   above ground at a station,

            and,

            A_i  = Thiessen area of each weather station
                   on the watershed.


24.  WVEL:  Mean wind velocity (mph) during the storm
            period.

25.  WSD :  Standard deviation (mph) of hourly wind
            velocities during the storm period.

26.  SOLT:  Average Thiessen-weighted soil temperature
            (deg F), 3 in. below surface, during storm
            period. Computed exactly as air temperature
            mean, only using four readings per day at
            6-hr intervals, beginning at 3:00 a.m.

27.  STSD:  Average Thiessen-weighted standard deviation
            of soil temperatures (deg F), 3 in. below
            surface, during storm period. Computed in the
            same manner as the air temperature standard
            deviation.


28.  DEGD:  Average Thiessen-weighted Degree-Days (deg F)
            for a 14-day period prior to the beginning of
            runoff. Degree-Days for each day was maximum
            daily temperature, and 14-day average was,

            $$\mathrm{DEGD} = \frac{\sum (DD_i \, A_i)}{14 \sum A_i}$$

            where,

            DD_i = total Degree-Days for a weather
                   station for a 14-day period before
                   the first day of runoff,

            and,

            A_i  = Thiessen area of each weather
                   station on the watershed.

29.  SWEQ:  Snow-water equivalent (in.), or the volume of
            water in the snowpack on the watershed on the
            date of the beginning of runoff. Computed
            from Thiessen-weighted volume from each snow
            course. Loss or gain in volume from date of
            snow survey to date of runoff was computed as
            the difference between accumulated precipitation
            and accumulated runoff. This was added to the
            Thiessen-weighted volume on the watershed at
            the time of the last snow survey, giving the
            volume on the date of beginning of runoff.
            Thiessen areas of snow courses were used to
            give the volume on the date of the survey, or,

            $$\frac{\sum (VC_i \, A_i)}{\sum A_i}$$

            where,

            VC_i = snow-water equivalent at a snow
                   course,

            and,

            A_i  = Thiessen area of each snow course
                   on the watershed.
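
Three of the storm-variable computations (the mean and standard deviation of hourly intensities for variables 13 and 14, the hyetograph center of area behind variable 15, and the antecedent precipitation index of variable 18) can be sketched as follows. This is only an illustration under stated assumptions: the standard deviation uses the n - 1 sample form, the API recursion starts from zero fourteen days before the storm, and all function names and sample values are hypothetical.

```python
import math


def intensity_mean_and_std(hourly_intensities):
    """Mean and sample standard deviation from running sums."""
    n = len(hourly_intensities)
    if n < 2:
        raise ValueError("need at least two hourly intensities")
    total = sum(hourly_intensities)
    total_sq = sum(x * x for x in hourly_intensities)
    variance = (total_sq - total * total / n) / (n - 1)
    return total / n, math.sqrt(variance)


def hyetograph_center(times_hr, hourly_precip):
    """Time to the center of area of one hyetograph: sum(T*I) / sum(I)."""
    total_precip = sum(hourly_precip)
    if total_precip <= 0:
        raise ValueError("hyetograph has no precipitation")
    return sum(t * p for t, p in zip(times_hr, hourly_precip)) / total_precip


def antecedent_precip_index(daily_precip, k=0.78, start=0.0):
    """Apply API_i = k * API_(i-1) + P_i over the supplied days.

    Assumption: the index is taken as `start` (zero by default) on the
    day before the first value in `daily_precip`.
    """
    api = start
    for precip in daily_precip:
        api = k * api + precip
    return api


if __name__ == "__main__":
    print(intensity_mean_and_std([1.0, 2.0, 3.0]))
    print(hyetograph_center([1.0, 2.0, 3.0], [1.0, 0.0, 1.0]))
    print(antecedent_precip_index([1.0, 0.0], k=0.5))
```

The station values returned by `hyetograph_center` would then be combined with Thiessen areas, as in variable 15, to give a single duration for the storm.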




APPENDIX  D 


C     THIS PROGRAM COMPUTES VALUES FOR VARIABLES 13, 14,
C     15, 16, AND 17 USING RECORDING AND NON-RECORDING PRECIP
C     DATA IN USWB FORM.
C
C     INPUT FOR A WATERSHED CONSISTS OF
C     CARD 1  NO. OF RECORDING PRECIP STAS., NO. OF RUNOFF
C             EVENTS, NO. OF NON-RECORDING PRECIP STAS., AREA
C             OF WSHD.
C     CARD 2  THIESSEN AREAS OF RECORDING PRECIP STAS.
C     CARD 3  THIESSEN AREAS OF ALL PRECIP STAS.-SUM TO AREA.
C     CARD 4  LAST FOUR DIGITS OF STA NOS. OF REC. AND NON-
C             REC. PRECIP STAS.
C     CARD 5  DATE OF PEAK DISCHARGE, DATE FIRST RECORDING
C             PRECIP, HOUR FIRST REC. PRECIP, DATE FIRST REC.
C             PRECIP EACH STATION, HOUR FIRST REC. PRECIP
C             EACH STA., DATE FIRST NON-REC. PRECIP EACH STA.,
C             DATE END OF RUNOFF, AND HOUR END OF RUNOFF, ALL
C             FOR FIRST EVENT.
C     CARD 6  SAME DATA ON CARD 5 FOR SECOND EVENT, ETC.
C     CARD 7  RECORDING PRECIP DATA IN ORDER OF STAS.-CARD 2.
C     CARD 8  NON-REC. PRECIP DATA IN ORDER OF STAS.-CARD 2.
C     CARD 9  BLANK CARD TERMINATES PROGRAM.

    1 FORMAT (1X,4A4,5X,I2,5X,I2,5X,I2,5X,F8.2)
    2 FORMAT (1X,7F8.2)
    3 FORMAT (1X,I4)
    4 FORMAT (1X,3I2,5X,3I2,5X,I2,5X,3I2,5X,I2,5X,3I2,5X,3I2,
     15X,I2)
    5 FORMAT (2X,I4,3I2,1X,12A3)
    6 FORMAT (1X,35H CARDS OUT OF ORDER FOR STATION NO.,I5,
     113H BEGINNING ON,2(I2,1H/),I2/19H PROGRAM TERMINATED)
    7 FORMAT (2X,I4,3I2,10X,A4)
  200 FORMAT (1H1)
    8 FORMAT (10X,37H VARIABLES 13, 14, 15, 18, AND 19 FOR,
     14A4,10H WATERSHED //15X,5H PEAK,5X,5H MEAN,5X,
     212H VARIANCE OF,5X,11H STD DEV OF,5X,9H THIESSEN,5X,
     39H THIESSEN,5X,13H THIESSEN AVG/15X,5H DATE,3X,
     410H INTENSITY,2X,12H INTENSITIES,4X,12H INTENSITIES,
     55X,9H DURATION,4X,11H TIME DISTR,5X,11H TOT PRECIP)
    9 FORMAT (25X,5H (IN),7X,8H (SQ IN),10X,5H (IN),8X,
     18H (HOURS),6X,8H (HOURS),10X,5H (IN) //)
   10 FORMAT (13X,2(I2,1H/),I2,5X,
     132H MISSING DATA FOR STATION NO 240,I3,3H ON,1X,
     22(I2,1H/),I2,2X,27H COMPUTE VARIABLES BY HAND.)
   11 FORMAT (13X,2(I2,1H/),I2,5X,
     138H ACCUMULATED PRECIP FOR STATION NO 240,I3,3H ON,1X,
     22(I2,1H/),I2,2X,27H COMPUTE VARIABLES BY HAND.)
   12 FORMAT (13X,2(I2,1H/),I2,4X,F6.4,7X,F7.5,10X,F7.4,8X,
     1F6.1,8X,F6.1,9X,F8.4)
   13 FORMAT (2X,I4)
      DIMENSION THA(4),THAN(7),MSTANO(7),NWSHD(4),NYRP(15),
     1NMOP(15),NDAP(15),NYRB(15),NMOB(15),NDAB(15),NHR(15),
     2NYRBB(15,7),NMOBB(15,7),NDABB(15,7),NHRR(15,7),
     3NYRBN(15,7),NMOBN(15,7),NDABN(15,7),NYRE(15),NMOE(15),
     4NDAE(15),NHRE(15),NNDAB(15),NNDABN(15,7),NNDAE(15),
     5XN(15),XSUM(15),XSUMSQ(15),SUMDA(15),SUMA(15)
      DIMENSION SUMAP(15),SSUMXT(15,7),SUMXT(15,7),SUMPTA(15),
     1SUMX(15,7),SSUMX(15,7),L(24),KKK(15),SUMXSQ(15),VAR(15),
     2STDEV(15),XBAR(15),DCA(15,7),D(15),PCA(15,7),TL(15),
     3PT(15)

      IIK=0
   14 CONTINUE
      READ 1,(NWSHD(J),J=1,4),II,NS,JJ,A
C     READ STORM CONTROL CARDS FOR A WATERSHED AND CONVERT
C     DATES TO DAY OF THE YEAR.
      IF(A) 159,159,15
   15 READ 2,(THA(J),J=1,II)
      LL=II+JJ
      READ 2,(THAN(J),J=1,LL)
      DO 68 KK=1,LL
   68 READ 3,MSTANO(KK)
      DO 168 K=1,NS
      DO 67 KK=1,LL
      READ 4,NYRP(K),NMOP(K),NDAP(K),NYRB(K),NMOB(K),NDAB(K),
     1NHR(K),NYRBB(K,KK),NMOBB(K,KK),NDABB(K,KK),NHRR(K,KK),
     2NYRBN(K,KK),NMOBN(K,KK),NDABN(K,KK),NYRE(K),NMOE(K),
     3NDAE(K),NHRE(K)
      NAA=NMOB(K)
      IF(NAA) 172,172,173
  173 GO TO (16,17,18,19,20,21,22,23,24,25,26,27),NAA
   16 NNDAB(K)=NDAB(K)
      GO TO 28
  172 NNDAB(K)=0
      GO TO 28
   17 NNDAB(K)=NDAB(K)+31
      GO TO 28
   18 NNDAB(K)=NDAB(K)+59
      GO TO 28
   19 NNDAB(K)=NDAB(K)+90
      GO TO 28
   20 NNDAB(K)=NDAB(K)+120
      GO TO 28
   21 NNDAB(K)=NDAB(K)+151
      GO TO 28
   22 NNDAB(K)=NDAB(K)+181
      GO TO 28
   23 NNDAB(K)=NDAB(K)+212
      GO TO 28
   24 NNDAB(K)=NDAB(K)+243
      GO TO 28
   25 NNDAB(K)=NDAB(K)+273
      GO TO 28
   26 NNDAB(K)=NDAB(K)+304
      GO TO 28
   27 NNDAB(K)=NDAB(K)+334
   28 NBB=NYRB(K)-62
      IF(NBB) 32,32,174
  174 GO TO (32,29,32,32,32,29),NBB
   29 IF(NNDAB(K)-60) 32,30,30
   30 IF(NMOB(K)-2) 32,32,31
   31 NNDAB(K)=NNDAB(K)+1
   32 NCC=NMOBN(K,KK)
      IF(NCC) 170,170,171
  171 GO TO (33,34,35,36,37,38,39,40,41,42,43,44),NCC
   33 NNDABN(K,KK)=NDABN(K,KK)
      GO TO 45
  170 NNDABN(K,KK)=0
      GO TO 45
   34 NNDABN(K,KK)=NDABN(K,KK)+31
      GO TO 45
   35 NNDABN(K,KK)=NDABN(K,KK)+59
      GO TO 45
   36 NNDABN(K,KK)=NDABN(K,KK)+90
      GO TO 45
   37 NNDABN(K,KK)=NDABN(K,KK)+120
      GO TO 45
   38 NNDABN(K,KK)=NDABN(K,KK)+151
      GO TO 45
   39 NNDABN(K,KK)=NDABN(K,KK)+181
      GO TO 45
   40 NNDABN(K,KK)=NDABN(K,KK)+212
      GO TO 45
   41 NNDABN(K,KK)=NDABN(K,KK)+243
      GO TO 45
   42 NNDABN(K,KK)=NDABN(K,KK)+273
      GO TO 45
   43 NNDABN(K,KK)=NDABN(K,KK)+304
      GO TO 45
   44 NNDABN(K,KK)=NDABN(K,KK)+334
   45 NDD=NYRBN(K,KK)-62
      IF(NDD) 49,49,175
  175 GO TO (49,46,49,49,49,46),NDD
   46 IF(NNDABN(K,KK)-60) 49,47,47
   47 IF(NMOBN(K,KK)-2) 49,49,48
   48 NNDABN(K,KK)=NNDABN(K,KK)+1
   49 NEE=NMOE(K)
      IF(NEE) 176,176,177
  177 GO TO (50,51,52,53,54,55,56,57,58,59,60,61),NEE
   50 NNDAE(K)=NDAE(K)
      GO TO 62
  176 NNDAE(K)=0
      GO TO 62
   51 NNDAE(K)=NDAE(K)+31
      GO TO 62
   52 NNDAE(K)=NDAE(K)+59
      GO TO 62
   53 NNDAE(K)=NDAE(K)+90
      GO TO 62
   54 NNDAE(K)=NDAE(K)+120
      GO TO 62
   55 NNDAE(K)=NDAE(K)+151
      GO TO 62
   56 NNDAE(K)=NDAE(K)+181
      GO TO 62
   57 NNDAE(K)=NDAE(K)+212
      GO TO 62
   58 NNDAE(K)=NDAE(K)+243
      GO TO 62
   59 NNDAE(K)=NDAE(K)+273
      GO TO 62
   60 NNDAE(K)=NDAE(K)+304
      GO TO 62
   61 NNDAE(K)=NDAE(K)+334
   62 NGG=NYRE(K)-62
      IF(NGG) 66,66,178
  178 GO TO (66,63,66,66,66,63),NGG
   63 IF(NNDAE(K)-60) 66,64,64
   64 IF(NMOE(K)-2) 66,66,65
   65 NNDAE(K)=NNDAE(K)+1
   66 XN(K)=0.
      XSUM(K)=0.
      XSUMSQ(K)=0.
      SUMDA(K)=0.
      SUMA(K)=0.
      SUMAP(K)=0.
      KKK(K)=0
      SSUMXT(K,KK)=0.
      SUMXT(K,KK)=0.
      SUMPTA(K)=0.
      SUMX(K,KK)=0.
   67 SSUMX(K,KK)=0.
  168 CONTINUE
      K=1
      I=1
   69 XHR=0.
      NC=0
      ZHR=0.
      NDAY=0
C     READ AND STORE PERTINENT RECORDING PRECIP DATA.
   70 NDDAY=NDAY
      READ 5,NSTANO,NYR,NMON,NDATE,(L(J),J=1,12)
      READ 5,JSTANO,KYR,KMON,KDATE,(L(J),J=13,24)
      IF(KMON-NMON) 73,71,73
   71 IF(KDATE-NDATE) 73,72,73
   72 IF(KYR-NYR) 73,74,73
   73 PRINT 6,NSTANO,NMON,NDATE,NYR
      GO TO 159
   74 NR=0
      IF(NSTANO-MSTANO(I)) 70,75,70
   75 GO TO (76,77,78,79,80,81,82,83,84,85,86,87),NMON
   76 NDAY=NDATE
      GO TO 88
   77 NDAY=NDATE+31
      GO TO 88
   78 NDAY=NDATE+59
      GO TO 88
   79 NDAY=NDATE+90
      GO TO 88
   80 NDAY=NDATE+120
      GO TO 88
   81 NDAY=NDATE+151
      GO TO 88
   82 NDAY=NDATE+181
      GO TO 88
   83 NDAY=NDATE+212
      GO TO 88
   84 NDAY=NDATE+243
      GO TO 88
   85 NDAY=NDATE+273
      GO TO 88
   86 NDAY=NDATE+304
      GO TO 88
   87 NDAY=NDATE+334
   88 NNYR=NYR-62
      GO TO (92,89,92,92,92,89),NNYR
   89 IF(NDAY-60) 92,90,90
   90 IF(NMON-2) 92,92,91
   91 NDAY=NDAY+1

   92 IF(NR) 135,93,135
   93 IF(NDAY-NNDAB(K)) 94,160,94
  160 IF(NYR-NYRB(K)) 70,103,70
   94 IF(NC-1) 95,101,95
   95 IF(NDAY-NNDAB(K)) 70,70,161
  161 IF(NYR-NYRB(K)) 70,96,70
   96 IF(NDAY-NNDAE(K)) 98,97,98
   97 XNHRE=NHRE(K)
      SUB=XNHRE-1.
      GO TO 100
   98 IF(NDAY-NNDAE(K)) 99,99,107
   99 SUB=0.0
  100 XDAYS=NDAY-NNDAB(K)
      NC=1
      ZHR=ZHR+24.*XDAYS-SUB
      GO TO 103
  101 IF(NDAY-(NDDAY+1)) 102,103,102
  102 XNDAY=NDAY-NDDAY-1
      ZHR=ZHR+24.*XNDAY
      XHR=XHR+24.*XNDAY
  103 NC=1
      IF(NDAY-NNDAE(K)) 105,104,105
  104 JK=NHRE(K)
      GO TO 114
  105 IF(NDAY-NNDAE(K)) 106,106,107
  106 JK=24
      GO TO 114
  107 IF(K-NS) 108,111,108
  108 IF(NDAY-NNDAB(K+1)) 109,110,109
  109 IF(NDATE-NDABB(K+1,I)) 128,110,128
  110 K=K+1
      ZHR=0.0
      XHR=0.0
      GO TO 103
  111 IF(I-II) 112,128,112
  112 IF(NDATE-NDABB(1,I+1)) 128,113,128
  113 K=1
      I=I+1
      GO TO 103
  114 DO 127 J=1,JK
      IF(NDAY-NNDAB(K)) 115,118,115
  115 IF(NDATE-NDABB(K,I)) 120,116,120
  116 IF(J-NHRR(K,I)) 117,120,120
  117 ZHR=ZHR+1.0
      GO TO 127
  118 IF(J-NHR(K)) 127,119,119
  119 IF(J-NHRR(K,I)) 121,120,120
  120 XHR=XHR+1.0
  121 ZHR=ZHR+1.0
      K1=L(J)/10000
      NKK=K1*10000
      J1=(L(J)-NKK)/100
      J3=K1/100
      KKKK=J3*100
      J2=K1-KKKK
      IF(J1) 122,122,126
  122 IF(J2) 124,124,123
  123 ZZ=1.
      KKK(K)=K
      NOO=NDATE
      NOOO=NYR
      NOON=NMON
      NONN=NSTANO
      GO TO 128
  124 IF(J3) 125,125,127
  125 ZZ=0.
      KKK(K)=K
      NOO=NDATE
      NOOO=NYR
      NOON=NMON
      NONN=NSTANO
      GO TO 128
C     COMPUTE SUMS AND SUMS OF SQUARES FOR RECORDING DATA.
  126 N=(J1-70)+(J2-70)*10+(J3-70)*100
      X=N
      XXN=X/100.
      SUMXT(K,I)=SUMXT(K,I)+XXN*XHR
      SSUMXT(K,I)=SSUMXT(K,I)+XXN*ZHR
      SSUMX(K,I)=SSUMX(K,I)+XXN
      XN(K)=XN(K)+1.0
      XSUM(K)=XSUM(K)+XXN
      XSUMSQ(K)=XSUMSQ(K)+XXN**2
      SUMX(K,I)=SUMX(K,I)+XXN
  127 CONTINUE
      GO TO 70
  128 IF(K-NS) 129,130,129
  129 K=K+1
      GO TO 131
  130 I=I+1
      K=1
  131 IF(I-II-1) 69,132,132
  132 K=1
      I=1

C     READ AND STORE ALL PERTINENT NON-RECORDING DATA FOR ALL
C     STORMS.
  133 READ 7,NSTANO,NYR,NMON,NDATE,NDSUM
      MM=II+I
      IF(NSTANO-MSTANO(MM)) 133,134,133
  134 KK=II+I
      NR=1
      GO TO 75
  135 K1=NDSUM/100
      NKK=K1*100
      J1=NDSUM-NKK
      K2=K1/100
      NKK1=K2*100
      J2=K1-NKK1
      J4=K2/100
      NKK2=J4*100
      J3=K2-NKK2
      IF(J1-20) 138,136,137
  136 XXN=0.0
      GO TO 140
  137 N=(J1-70)+(J2-70)*10+(J3-70)*100+(J4-70)*1000
      X=N
      XXN=X/100.
      GO TO 140
  139 KKK(K)=K
      ZZ=0.0
      NOO=NDATE
      NOOO=NYR
      NOON=NMON
      NONN=NSTANO
      GO TO 144
  138 IF(J4-20) 140,136,137
  140 IF(NDAY-NNDABN(K,KK)) 133,162,162
  162 IF(NYR-NYRBN(K,KK)) 133,141,133
  141 IF(NDAY-NNDAE(K)) 142,143,143
  142 IF(J1-20) 899,900,900
  899 IF(J4-20) 139,900,900
  900 SUMX(K,KK)=SUMX(K,KK)+XXN
      GO TO 133
  143 IF(NDAY-NNDAE(K)) 144,142,144
  144 IF(K-NS) 145,146,145
  145 K=K+1
      GO TO 147
  146 K=1
      I=I+1
C     COMPUTE VALUES OF VARIABLES FOR EACH STORM.
  147 IF(I-JJ-1) 133,148,148
  148 DO 150 K=1,NS
      DO 149 I=1,II
      SUMXSQ(K)=XSUM(K)**2
      IF(XN(K)) 149,149,404
  404 VAR(K)=(XSUMSQ(K)-SUMXSQ(K)/XN(K))/(XN(K)-1.0)
      STDEV(K)=SQRTF(VAR(K))
      XBAR(K)=XSUM(K)/XN(K)
      IF(SSUMX(K,I)) 149,149,405
  405 DCA(K,I)=SUMXT(K,I)/SSUMX(K,I)
      SUMDA(K)=SUMDA(K)+DCA(K,I)*THA(I)
      D(K)=SUMDA(K)/A
      PCA(K,I)=SSUMXT(K,I)/SSUMX(K,I)
      SUMPTA(K)=SUMPTA(K)+PCA(K,I)*THA(I)
      TL(K)=SUMPTA(K)/A
  149 CONTINUE
  150 CONTINUE
      PRINT 200
C     OUTPUT TABLE HEADINGS AND VALUES OF VARIABLES FOR ALL
C     STORMS.
      PRINT 8,(NWSHD(J),J=1,4)
      PRINT 9
      IJI=II+JJ
      DO 152 K=1,NS
      DO 151 I=1,IJI
  151 SUMAP(K)=SUMAP(K)+SUMX(K,I)*THAN(I)
  152 CONTINUE
      DO 157 K=1,NS
      IF(K-KKK(K)) 156,153,156
  153 IF(ZZ) 155,154,155
  154 PRINT 10,NMOP(K),NDAP(K),NYRP(K),NONN,NOON,NOO,NOOO
      GO TO 157
  155 PRINT 11,NMOP(K),NDAP(K),NYRP(K),NONN,NOON,NOO,NOOO
      GO TO 157
  156 PT(K)=SUMAP(K)/A
      PRINT 12,NMOP(K),NDAP(K),NYRP(K),XBAR(K),VAR(K),
     1STDEV(K),D(K),TL(K),PT(K)
  157 CONTINUE
  158 READ 13,NSTANO
      IF(NSTANO) 158,14,158
  159 CALL EXIT
      END
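
The computed GO TO chains in the listing above convert a month and day to a day of the year by adding the days elapsed before each month (31, 59, 90, ..., 334), and then add one day for dates after February in the leap years covered by the data. A minimal sketch of the same conversion follows. The use of a four-digit year and of `calendar.isleap` is a substitution made here for illustration; the listing itself tests two-digit years against 62.

```python
import calendar

# Days elapsed before the first of each month in a non-leap year.
# These are the same offsets the listing adds month by month.
DAYS_BEFORE_MONTH = [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]


def day_of_year(year, month, day):
    """Return the day number within the year (1 January = 1)."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    number = DAYS_BEFORE_MONTH[month - 1] + day
    # Dates after February fall one day later in a leap year.
    if calendar.isleap(year) and month > 2:
        number += 1
    return number


if __name__ == "__main__":
    print(day_of_year(1963, 3, 1))  # 60
    print(day_of_year(1964, 3, 1))  # 61
```

Working in day numbers is what lets the program compare storm, precipitation, and runoff dates with simple integer subtraction.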
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APPENDIX  E 

CORRELATION  COMPUTER  PROGRAM 
FROM  COOLEY  AND  LOHNES 

C     INPUT IS ACTUAL RAW OR TRANSFORMED DATA TO BE READ IN BY
C     ROWS FROM A MATRIX WITH DIFFERENT TRIALS AS ROWS AND VAR-
C     IABLES AS COLUMNS. ALL VALUES FOR EACH TRIAL (ONE ROW OF
C     THE DATA MATRIX) ARE READ AND TREATED BEFORE THE NEXT ROW
C     IS READ. THIS ELIMINATES STORAGE OF THE WHOLE DATA MAT-
C     RIX. INPUT AND OUTPUT FORMAT STATEMENTS MUST BE CHANGED
C     TO ACCOMMODATE THE DATA. THE FOLLOWING VARIABLE NAMES ARE
C     USED IN THE PROGRAM. THE FIRST FOUR MUST BE READ IN ON
C     A CARD IMMEDIATELY PRECEDING THE DATA AND COMPLYING
C     WITH THE FIRST FORMAT. THE REST DEFINE THE OUTPUT.
C
C     M=THE NUMBER OF VARIABLES TO BE CORRELATED WITH
C       EACH OTHER.
C     N=THE NUMBER OF OBSERVATIONS OF ALL M VARIABLES.
C     LPUNCH=0 IF CORRELATION MATRIX IS TO BE PUNCHED.
C           =1 IF CORRELATION IS TO BE PRINTED ONLY.
C     LPRINT=0 IF ALL COMPUTED PARAMETERS AND MATRICES ARE TO
C              BE PRINTED AND PUNCHED.
C           =1 IF ONLY CORRELATION MATRIX IS TO BE PRINTED.
C     SX(I)= THE SUM OF ALL VALUES OF A VARIABLE WHERE I IS
C            TAKEN FROM 1 TO M.
C     X(I)= A VALUE OF A VARIABLE WHERE I IS THE SUBSCRIPT
C           OF THE PARTICULAR VARIABLE.
C     SS(I,J)= A MATRIX ENTRY WHICH IS EITHER A SUM OF THE
C              SQUARES OF ALL VALUES OF A VARIABLE OR A SUM OF
C              THE CROSS-PRODUCTS OF ALL VALUES OF TWO VARIAB-
C              LES.
C     SSD(I,J)=A MATRIX ENTRY IN THE DEVIATION SUMS OF SQUARES
C              MATRIX, COMPUTED FROM THE VALUES IN THE
C              PREVIOUS MATRIX.
C     D(I,J)= A MATRIX ENTRY IN THE VARIANCE-COVARIANCE MATRIX
C             COMPUTED AS SSD(I,J)/(N-1).
C     SD(I)= A STANDARD DEVIATION COMPUTED AS THE SQUARE ROOT
C            OF THE VARIANCES, D(I,I), I=1,M
C     R(I,J)= A CORRELATION COEFFICIENT IN THE CORRELATION
C             MATRIX.
C
C     OUTPUT FROM THE PROGRAM IS LABELLED THROUGH OUTPUT FORMAT
C     STATEMENTS.

      DIMENSION X(32),SX(32),SS(32,32),SSD(32,32),D(32,32),
     1R(32,32),SD(32),XM(32)
      COMMON X,M
      READ 1,M,N,LPUNCH,LPRINT
    1 FORMAT (I2,I2,I1,I1)
      NTRIAL=N
C     COMPUTE SUMS, SUMS OF SQUARES, SUMS OF CROSS-PRODUCTS
      DO 2 I=1,M
      SX(I)=0.0
      DO 2 J=I,M
    2 SS(I,J)=0.0
    3 READ 4,(X(I),I=1,M)
    4 FORMAT ((10X,10F7.0))
      CALL TRANFO
      DO 5 I=1,M
      SX(I)=SX(I)+X(I)
      DO 5 J=I,M
    5 SS(I,J)=SS(I,J)+X(I)*X(J)
      NTRIAL=NTRIAL-1
      XN=N
C     SET LOWER TRIANGLE EQUAL TO UPPER TRIANGLE AND COMPUTE THE
C     DEVIATION SUMS OF SQUARES AND CROSS-PRODUCTS MATRIX
      IF (NTRIAL) 6,6,3
    6 DO 7 I=1,M
      DO 7 J=I,M
      SSD(I,J)=(XN*SS(I,J)-SX(I)*SX(J))/XN
      SS(J,I)=SS(I,J)
    7 SSD(J,I)=SSD(I,J)
C     COMPUTE STANDARD DEVIATIONS
      DO 8 I=1,M
      XM(I)=SX(I)/XN
    8 SD(I)=SQRTF(SSD(I,I)/(XN-1.0))
      PRINT 9
      PUNCH 9
    9 FORMAT (1H1,30X,27H CORRELATION PROGRAM OUTPUT /)
C     OPTION TO PRINT NUMBER OF VARIABLES, NUMBER OF OBSERVA-
C     TIONS, MEANS, STANDARD DEVIATIONS, SUMS OF SQUARES,
C     DEVIATION SUMS OF SQUARES
      IF (LPRINT) 10,10,22
   10 PRINT 11,M,N
      PUNCH 11,M,N
   11 FORMAT (25X,4H FOR,I3,15H VARIABLES WITH,I3,
     113H OBSERVATIONS //)
      PRINT 12
      PUNCH 12
   12 FORMAT (25X,37H MEANS IN THE ORDER OF VARIABLE INPUT /)
      PRINT 13,(XM(I),I=1,M)
      PUNCH 13,(XM(I),I=1,M)
   13 FORMAT ((14X,6F10.6)/)
      PRINT 14
      PUNCH 14
   14 FORMAT (27X,33H STANDARD DEVIATIONS OF VARIABLES /)
      PRINT 15,(SD(I),I=1,M)
      PUNCH 15,(SD(I),I=1,M)
   15 FORMAT ((14X,6F10.6)/)
      PRINT 16
   16 FORMAT (23X,35H SUMS OF SQUARES AND CROSS PRODUCTS,
     17H MATRIX /)
   17 FORMAT (14X,4H ROW,I3/(14X,6F10.4)/)
      DO 18 I=1,M
   18 PRINT 17,I,(SS(I,J),J=1,M)
      PRINT 19
   19 FORMAT (18X,36H DEVIATION SUMS OF SQUARES AND CROSS,
     116H PRODUCTS MATRIX /)
   20 FORMAT (14X,4H ROW,I3/(14X,6F10.4)/)
      DO 21 I=1,M
   21 PRINT 20,I,(SSD(I,J),J=1,M)
C     COMPUTE VARIANCE-COVARIANCE MATRIX
   22 DO 23 I=1,M
      DO 23 J=I,M
      D(I,J)=SSD(I,J)/(XN-1.0)
   23 D(J,I)=D(I,J)
C     OPTION TO PRINT VARIANCE-COVARIANCE MATRIX
      IF (LPRINT) 24,24,28
   24 PRINT 25
      PUNCH 25
   25 FORMAT (30X,27H VARIANCE-COVARIANCE MATRIX /)
   26 FORMAT (14X,4H ROW,I3/(14X,6F10.5)/)
      DO 27 I=1,M
      PUNCH 26,I,(D(I,J),J=1,M)
   27 PRINT 26,I,(D(I,J),J=1,M)
C     COMPUTE CORRELATION MATRIX
   28 DO 29 I=1,M
      DO 29 J=I,M
      R(I,J)=D(I,J)/(SD(I)*SD(J))
   29 R(J,I)=R(I,J)
C     OPTION TO PUNCH CORRELATION MATRIX
      PRINT 30
      PUNCH 30
   30 FORMAT (34X,19H CORRELATION MATRIX /)
   31 FORMAT (14X,4H ROW,I3/(14X,6F10.6)/)
      DO 32 I=1,M
   32 PRINT 31,I,(R(I,J),J=1,M)
      IF (LPUNCH) 33,33,35
   33 DO 34 I=1,M
   34 PUNCH 31,I,(R(I,J),J=1,M)
   35 CALL EXIT
      END
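
The main program above accumulates sums and cross-products one data row at a time, so the full data matrix is never stored, and then derives means, standard deviations, and correlations from those sums. A minimal sketch of the same arithmetic follows. The names mirror the listing's SX, SS, and SSD arrays, the log transformation done by TRANFO is left out, and the sample rows are hypothetical.

```python
import math


def correlation_from_rows(rows):
    """Return (means, std_devs, correlation_matrix) from an iterable of rows.

    Only running sums are kept, as in the listing: sx[i] is the sum of
    variable i, and ss[i][j] is the sum of cross-products of i and j.
    """
    sx, ss, m, n = None, None, 0, 0
    for row in rows:
        if sx is None:
            m = len(row)
            sx = [0.0] * m
            ss = [[0.0] * m for _ in range(m)]
        n += 1
        for i in range(m):
            sx[i] += row[i]
            for j in range(i, m):          # upper triangle only
                ss[i][j] += row[i] * row[j]
    if n < 2:
        raise ValueError("need at least two observations")

    # Deviation sums of squares and cross-products, mirrored to both triangles.
    ssd = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i, m):
            ssd[i][j] = ssd[j][i] = (n * ss[i][j] - sx[i] * sx[j]) / n

    means = [total / n for total in sx]
    std_devs = [math.sqrt(ssd[i][i] / (n - 1)) for i in range(m)]
    corr = [[(ssd[i][j] / (n - 1)) / (std_devs[i] * std_devs[j])
             for j in range(m)] for i in range(m)]
    return means, std_devs, corr


if __name__ == "__main__":
    means, std_devs, corr = correlation_from_rows([(1, 2), (2, 4), (3, 6)])
    print(means, std_devs, corr[0][1])
```

A variable that never changes has a zero standard deviation and would cause a division by zero here, just as it would in the listing.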


      SUBROUTINE TRANFO
      DIMENSION X(32)
      COMMON X,M
      DO 5 I=1,M
      IF(I-28) 2,1,2
    1 X(I)=X(I)+32.0
    2 IF(X(I)) 4,3,4
    3 X(I)=.000001
    4 X(I)=ALOG10(X(I))
    5 CONTINUE
      RETURN
      END

APPENDIX F

PRINCIPAL COMPONENT PROGRAM
FROM COOLEY AND LOHNES

C     THIS PROGRAM PERFORMS PRINCIPAL-COMPONENT ANALYSIS
C     THAT BEGINS WITH EITHER THE CORRELATION OR DISPERSION
C     MATRIX, RATHER THAN RAW SCORES. MATRICES UP TO 60TH
C     ORDER ARE POSSIBLE, BUT THE DIMENSION STATEMENT COULD
C     EASILY BE MODIFIED IF LARGER MATRICES WERE NECESSARY.
C
C     INPUT.
C     CONTROL CARD 1 CONTAINS (COL 1-48) = PROBLEM IDEN-
C     TIFICATION WHICH MAY BE ALPHABETIC, (COL 49-50)=SIZE OF
C     MATRIX (RIGHT-JUSTIFIED). IF M=0, THEN CALL EXIT. THE
C     MATRIX FOLLOWS CARD 1 AND IS READ IN BY A SUBROUTINE
C     WHICH MUST BE STORED TO THE RIGHT OF THE MAIN DIAGONAL
C     OF R. SUBROUTINE HDIAG IS REQUIRED.
C
C     OUTPUT.
C     PRINTED OUTPUT INCLUDES THE ROOTS OF MATRIX, NORM-
C     ALIZED VECTORS, PERCENTAGE OF TRACE ACCOUNTED FOR BY THE
C     ROOTS, AND THE FACTOR LOADINGS.

      DIMENSION PROB(12)
      DIMENSION R(60,60),SS(60,60),IQ(60),FRACT(60),X(60)
      INTEGER PROB
C     READING IN THE CONTROL CARD
  100 READ (105,121) PROB,M
  121 FORMAT (12A4,I2)
      IF (M) 1000,1000,210
  210 WRITE (108,21) PROB
      WRITE (106,21) PROB
   21 FORMAT (1H1,24X,12A4///)
      WRITE (108,25)
      WRITE (106,25)
   25 FORMAT (30X,28H ORIGINAL CORRELATION MATRIX/)
      CALL RHDIAG (R,M)
      T = M
      CALL HDIAG (R,M,0,SS,NR)
      WRITE (108,710) NR
      WRITE (106,710) NR
  710 FORMAT (1H0,22X,31H NO. OF ROTATIONS FROM JACOBIAN,
     113H SUBROUTINE =,I3//)
      WRITE (108,1)
      WRITE (106,1)
    1 FORMAT (22X,36H CHARACTERISTIC ROOTS OF CORRELATION,
     17H MATRIX/)
      WRITE (108,3) (R(I,I), I=1,M)
      WRITE (106,3) (R(I,I), I=1,M)
    3 FORMAT ((14X,6F10.6))
      WRITE (108,4)
      WRITE (106,4)
    4 FORMAT (1H0,32X,24H NORMALIZED EIGENVECTORS/25X,
     138H NOTE THAT VECTORS ARE WRITTEN IN ROWS/)
    5 FORMAT ((14X,6F10.4))
      DO 6 J = 1,M
      WRITE (106,7) J,(SS(I,J), I=1,M)
    6 WRITE (108,7) J,(SS(I,J), I=1,M)
    7 FORMAT (14X,7H VECTOR,I3/(14X,6F10.6))
      DO 10 I = 1,M
   10 FRACT(I) = (R(I,I)/T)*100.0
      WRITE (108,11)
      WRITE (106,11)
   11 FORMAT (1H0,15X,37H PERCENTAGE OF VARIANCE ACCOUNTED FOR,
     120H BY EACH EIGENVECTOR/)
      WRITE (108,5) (FRACT(I), I=1,M)
      WRITE (106,5) (FRACT(I), I=1,M)
      DO 111 I = 2,M
  111 FRACT(I) = FRACT(I) + FRACT(I-1)
      WRITE (108,112)
      WRITE (106,112)
  112 FORMAT (1H0,22X,23H ACCUMULATED PERCENTAGE/)
      WRITE (108,5) (FRACT(I), I=1,M)
      WRITE (106,5) (FRACT(I), I=1,M)
      WRITE (108,113)
      WRITE (106,113)
  113 FORMAT (1H0,37X,14H FACTOR MATRIX/25X,10H NOTE THAT,
     128H FACTORS ARE WRITTEN IN ROWS/)
      DO 12 I = 1,M
   12 R(I,I) = SQRT(R(I,I))
      DO 13 J = 1,M
      DO 13 I = 1,M
   13 SS(I,J) = SS(I,J) * R(J,J)
      DO 14 J = 1,M
      WRITE (106,15) J,(SS(I,J), I=1,M)
   14 WRITE (108,15) J,(SS(I,J), I=1,M)
   15 FORMAT (14X,7H FACTOR,I3/(14X,6F10.6))
      GO TO 100
 1000 CALL EXIT
      END
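
The arithmetic performed by the main program after HDIAG returns can be summarized in a short sketch. It is written in Python purely for illustration and is not part of the thesis listings; the function name and the 2 x 2 example matrix are hypothetical. Each root of an M x M correlation matrix is divided by the trace (M) to give a percentage, the percentages are accumulated, and each normalized eigenvector is scaled by the square root of its root to give the factor loadings.

```python
import math

def loadings_and_percent(eigenvalues, eigenvectors):
    """eigenvectors[k] is the k-th normalized eigenvector, stored as a list."""
    m = len(eigenvalues)  # the trace of an m x m correlation matrix is m
    percent = [100.0 * ev / m for ev in eigenvalues]
    cumulative = []
    running = 0.0
    for p in percent:
        running += p
        cumulative.append(running)
    # factor loadings: eigenvector scaled by the square root of its root
    loadings = [[math.sqrt(ev) * x for x in vec]
                for ev, vec in zip(eigenvalues, eigenvectors)]
    return percent, cumulative, loadings

# Hypothetical example: the correlation matrix [[1, 0.6], [0.6, 1]] has
# roots 1.6 and 0.4 with eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2).
r = 1.0 / math.sqrt(2.0)
percent, cumulative, loadings = loadings_and_percent(
    [1.6, 0.4], [[r, r], [r, -r]])
```

The sketch assumes every root is non-negative, as the square root in the listing also does.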


      SUBROUTINE RHDIAG(R,M)
      DIMENSION R(60,60)
    1 FORMAT (14X,4H ROW,I3/(14X,6F10.6))
    2 FORMAT ((14X,6F10.6))
      DO 3 I = 1,M
      READ 2, (R(I,J),J=1,M)
    3 PRINT 1, I, (R(I,J),J=1,M)
      RETURN
      END

C     SUBROUTINE HDIAG.
C     PROGRAMMED BY F. J. CORBATO AND M. MERWIN OF THE
C     M.I.T. COMPUTATION CENTER.
C     THIS SUBROUTINE COMPUTES THE EIGENVALUES AND
C     EIGENVECTORS OF A REAL SYMMETRIC MATRIX, H, OF ORDER N (WHERE
C     N MUST BE LESS THAN 61), AND PLACES THE EIGENVALUES IN
C     THE DIAGONAL ELEMENTS OF THE MATRIX H, AND PLACES THE
C     EIGENVECTORS (NORMALIZED) IN THE COLUMNS OF THE MATRIX U.
C     IEGEN IS SET AS 1 IF ONLY EIGENVALUES ARE DESIRED, AND
C     IS SET TO 0 WHEN VECTORS ARE REQUIRED.  NR CONTAINS THE
C     NUMBER OF ROTATIONS DONE.

      SUBROUTINE HDIAG (H,N,IEGEN,U,NR)
      DIMENSION H(60,60),U(60,60),X(60),IQ(60)
      IF (IEGEN) 15,10,15
   10 DO 14 I=1,N
      DO 14 J=1,N
      IF (I-J) 12,11,12
   11 U(I,J) = 1.0
      GO TO 14
   12 U(I,J) = 0.
   14 CONTINUE
   15 NR = 0
      IF (N-1) 1000,1000,17
C     SCAN FOR LARGEST OFF-DIAGONAL ELEMENT IN EACH ROW
C     X(I) CONTAINS LARGEST ELEMENT IN ITH ROW
C     IQ(I) HOLDS SECOND SUBSCRIPT DEFINING POSITION OF ELEMENT
   17 NMI1 = N-1
      DO 30 I=1,NMI1
      X(I) = 0.
      IPL1 = I+1
      DO 30 J=IPL1,N
      IF (X(I) - ABS(H(I,J))) 20,20,30
   20 X(I) = ABS(H(I,J))
      IQ(I) = J
   30 CONTINUE
C     SET INDICATOR FOR SHUT-OFF. RAP=2**-27, NR = NO. OF ROTATIONS
      RAP = 7.450580596E-9
      HDTEST = 1.0E38

C     FIND MAXIMUM OF X(I) S FOR PIVOT ELEMENT AND
C     TEST FOR END OF PROBLEM
   40 DO 70 I=1,NMI1
      IF (I-1) 60,60,45
   45 IF (XMAX - X(I)) 60,70,70
   60 XMAX = X(I)
      IPIV = I
      JPIV = IQ(I)
   70 CONTINUE
C     IS MAX. X(I) EQUAL TO ZERO, IF LESS THAN HDTEST, REVISE
C     HDTEST
      IF (XMAX) 1000,1000,80
   80 IF (HDTEST) 90,90,85
   85 IF (XMAX - HDTEST) 90,90,148
   90 HDIMIN = ABS(H(1,1))
      DO 110 I=2,N
      IF (HDIMIN - ABS(H(I,I))) 110,110,100
  100 HDIMIN = ABS(H(I,I))
  110 CONTINUE
      HDTEST = HDIMIN*RAP
C     RETURN IF MAX.H(I,J) LESS THAN (2**-27) ABS(H(K,K)-MIN)
      IF (HDTEST - XMAX) 148,1000,1000
  148 NR = NR+1
C     COMPUTE TANGENT, SINE AND COSINE, H(I,I), H(J,J)
  150 TANG=SIGN(2.0,(H(IPIV,IPIV)-H(JPIV,JPIV)))*H(IPIV,JPIV)/
     1(ABS(H(IPIV,IPIV)-H(JPIV,JPIV))+SQRT((H(IPIV,IPIV)-H(
     2JPIV,JPIV))**2+4.0*H(IPIV,JPIV)**2))
      COSINE = 1.0/SQRT(1.0+TANG**2)
      SINE = TANG*COSINE
      HII = H(IPIV,IPIV)
      H(IPIV,IPIV)=COSINE**2*(HII+TANG*(2.*H(IPIV,JPIV)+TANG*H
     1(JPIV,JPIV)))
      H(JPIV,JPIV)=COSINE**2*(H(JPIV,JPIV)-TANG*(2.*H(IPIV,JPI
     1V)-TANG*HII))
      H(IPIV,JPIV) = 0.
C     PSEUDO RANK THE EIGENVALUES
C     ADJUST SINE AND COS FOR COMPUTATION OF H(IK) AND U(IK)
      IF (H(IPIV,IPIV) - H(JPIV,JPIV)) 152,153,153
  152 HTEMP = H(IPIV,IPIV)
      H(IPIV,IPIV) = H(JPIV,JPIV)
      H(JPIV,JPIV) = HTEMP
C     RECOMPUTE SINE AND COS
      HTEMP = SIGN(1.0,-SINE) * COSINE
      COSINE = ABS(SINE)
      SINE = HTEMP
  153 CONTINUE

C     INSPECT THE IQS BETWEEN I+1 AND N-1 TO DETERMINE
C     WHETHER A NEW MAXIMUM VALUE SHOULD BE COMPUTED SINCE
C     THE PRESENT MAXIMUM IS IN THE I OR J ROW.
      DO 350 I=1,NMI1
      IF (I-IPIV) 210,350,200
  200 IF (I-JPIV) 210,350,210
  210 IF (IQ(I)-IPIV) 230,240,230
  230 IF (IQ(I)-JPIV) 350,240,350
  240 K = IQ(I)
  250 HTEMP = H(I,K)
      H(I,K) = 0.
      IPL1 = I+1
      X(I) = 0.
C     SEARCH IN DEPLETED ROW FOR NEW MAXIMUM
      DO 320 J=IPL1,N
      IF (X(I) - ABS(H(I,J))) 300,300,320
  300 X(I) = ABS(H(I,J))
      IQ(I) = J
  320 CONTINUE
      H(I,K) = HTEMP
  350 CONTINUE
      X(IPIV) = 0.
      X(JPIV) = 0.
C     CHANGE THE OTHER ELEMENTS OF H
      DO 530 I=1,N
      IF (I-IPIV) 370,530,420
  370 HTEMP = H(I,IPIV)
      H(I,IPIV) = COSINE*HTEMP + SINE*H(I,JPIV)
      IF (X(I) - ABS(H(I,IPIV))) 380,390,390
  380 X(I) = ABS(H(I,IPIV))
      IQ(I) = IPIV
  390 H(I,JPIV) = -SINE*HTEMP + COSINE*H(I,JPIV)
      IF (X(I) - ABS(H(I,JPIV))) 400,530,530
  400 X(I) = ABS(H(I,JPIV))
      IQ(I) = JPIV
      GO TO 530
  420 IF (I-JPIV) 430,530,480
  430 HTEMP = H(IPIV,I)
      H(IPIV,I) = COSINE*HTEMP + SINE*H(I,JPIV)
      IF (X(IPIV) - ABS(H(IPIV,I))) 440,450,450
  440 X(IPIV) = ABS(H(IPIV,I))
      IQ(IPIV) = I
  450 H(I,JPIV) = -SINE*HTEMP + COSINE*H(I,JPIV)
      IF (X(I) - ABS(H(I,JPIV))) 400,530,530
  480 HTEMP = H(IPIV,I)
      H(IPIV,I) = COSINE*HTEMP + SINE*H(JPIV,I)
      IF (X(IPIV) - ABS(H(IPIV,I))) 490,500,500

  490 X(IPIV) = ABS(H(IPIV,I))
      IQ(IPIV) = I
  500 H(JPIV,I) = -SINE*HTEMP + COSINE*H(JPIV,I)
      IF (X(JPIV) - ABS(H(JPIV,I))) 510,530,530
  510 X(JPIV) = ABS(H(JPIV,I))
      IQ(JPIV) = I
  530 CONTINUE
C     TEST FOR COMPUTATION OF EIGENVECTORS
      IF (IEGEN) 40,540,40
  540 DO 550 I=1,N
      HTEMP = U(I,IPIV)
      U(I,IPIV) = COSINE*HTEMP + SINE*U(I,JPIV)
  550 U(I,JPIV) = -SINE*HTEMP + COSINE*U(I,JPIV)
      GO TO 40
 1000 RETURN
      END
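
The idea behind HDIAG (repeatedly rotating away the largest off-diagonal element of a real symmetric matrix) can be sketched compactly. The Python below is only an illustration under stated assumptions and is not a transcription of the listing: the function name, the absolute tolerance `tol`, and the rotation limit are hypothetical, the rotation angle is taken from `atan2` rather than from the tangent formula used above, and the row-maximum bookkeeping in X and IQ is replaced by a plain search.

```python
import math

def jacobi_eigen(h, tol=1e-12, max_rotations=10000):
    """Return (eigenvalues, eigenvectors as columns of u, rotation count)."""
    n = len(h)
    a = [row[:] for row in h]  # work on a copy of the symmetric matrix
    u = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    rotations = 0
    while rotations < max_rotations:
        # locate the largest off-diagonal element (upper triangle)
        p, q, big = 0, 1, 0.0
        for i in range(n - 1):
            for j in range(i + 1, n):
                if abs(a[i][j]) > big:
                    p, q, big = i, j, abs(a[i][j])
        if big <= tol:  # assumed stopping rule, simpler than HDTEST
            break
        # angle that annihilates a[p][q]
        theta = 0.5 * math.atan2(2.0 * a[p][q], a[p][p] - a[q][q])
        c, s = math.cos(theta), math.sin(theta)
        for k in range(n):  # rotate columns p and q
            akp, akq = a[k][p], a[k][q]
            a[k][p] = c * akp + s * akq
            a[k][q] = -s * akp + c * akq
        for k in range(n):  # rotate rows p and q
            apk, aqk = a[p][k], a[q][k]
            a[p][k] = c * apk + s * aqk
            a[q][k] = -s * apk + c * aqk
        for k in range(n):  # accumulate the eigenvectors
            ukp, ukq = u[k][p], u[k][q]
            u[k][p] = c * ukp + s * ukq
            u[k][q] = -s * ukp + c * ukq
        rotations += 1
    return [a[i][i] for i in range(n)], u, rotations

# Hypothetical 2 x 2 correlation matrix
values, vectors, count = jacobi_eigen([[1.0, 0.6], [0.6, 1.0]])
```

As in HDIAG, the eigenvalues end up on the diagonal and the normalized eigenvectors in the columns of the accumulated rotation matrix. Unlike HDIAG, the sketch does not reorder the eigenvalues during the rotations.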


C  APPENDIX G
C
C  VARIMAX ROTATION PROGRAM
C  FROM COOLEY AND LOHNES
C  THIS PROGRAM WILL PERFORM VARIMAX ROTATION FOR UP TO 75
C  FACTORS AND 100 VARIABLES.
C
C  INPUT.
C  ENTER NEIG (COLS 59 - 60).
C  CARD 1 CONTAINS PROB (C.C. 1-48) = PROBLEM
C  IDENTIFICATION WHICH MAY BE ALPHABETIC, N (COL 49-51) = NO. OF
C  VARIABLES, IF N = 0, CALL EXIT, L (COLS 52-53) = NO. OF
C  FACTORS, NVALUE (COLS 54-55) = NO. OF ITERATION CYCLES
C  (USE 00 IF THIS TEST IS NOT DESIRED), JTEST (COL 56-57) =
C  NO. OF CYCLES NEW VARIANCE REQUIRED TO EQUAL OLD
C  VARIANCE, RORC (COL 58) = 0 IF AN INPUT CARD CONTAINS THE
C  LOADINGS FOR A VARIABLE (MATRIX IS IN CARDS AS ROWS),
C  RORC = 1 IF AN INPUT CARD CONTAINS THE LOADINGS FOR A
C  FACTOR (MATRIX IS IN CARDS AS COLUMNS).
C  IF INPUT DATA IS FROM EIGENVECTORS AND NOT FACTORS,
C  NEIG = 0, OTHERWISE NEIG = 1.  IF NEIG IS 0, ENTER
C  EIGENVALUES IN THE ORDER OF RESPECTIVE EIGENVECTORS ON
C  CARD(S) WHICH FOLLOW THE ABOVE DATA.
C  LOADINGS FOLLOW CARD 1.
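
The card layout described in these comments corresponds to the field widths of FORMAT 1006 (12A4, I3, 3I2, I1, I2). As an illustration only, the Python sketch below slices an 80-column card image into those fields. The function name, the dictionary keys, and the sample card contents are hypothetical; blank numeric fields are treated as zero.

```python
def parse_control_card(card):
    """Split a varimax control card into its fixed-width fields."""
    card = card.ljust(60)  # pad short card images with blanks

    def to_int(text):
        text = text.strip()
        return int(text) if text else 0  # a blank field is read as zero

    return {
        "PROB": card[0:48],              # cols 1-48, problem identification
        "N": to_int(card[48:51]),        # cols 49-51, number of variables
        "L": to_int(card[51:53]),        # cols 52-53, number of factors
        "NVALUE": to_int(card[53:55]),   # cols 54-55, iteration cycles
        "JTEST": to_int(card[55:57]),    # cols 56-57, equal-variance cycles
        "RORC": to_int(card[57:58]),     # col 58, rows (0) or columns (1)
        "NEIG": to_int(card[58:60]),     # cols 59-60, eigenvector input flag
    }

# Hypothetical card: 31 variables, 8 factors, no cycle limit, JTEST = 3,
# loadings punched by rows, input already in factor form.
card = "MONTANA WATERSHEDS".ljust(48) + " 31" + " 8" + "00" + " 3" + "0" + " 1"
fields = parse_control_card(card)
```

A card with N = 0 (or blank) would end the run, matching the CALL EXIT branch of the program.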

      DIMENSION A(50,20),V(50),TV(50),H(50),AX(50,20)
     1,HN(50),HD(50),PROB(12),EIGVAL(20),AY(50,20)
      COMMON A,V,TV,NV,N,L,FN,T,B,P
 1001 FORMAT(17H0NO.     VARIANCE)
 1002 FORMAT(1H ,I3,F20.8)
 1003 FORMAT(40X,16H0  FACTOR MATRIX)
 1004 FORMAT(9H0VARIABLE,I4/)
 1005 FORMAT(1H ,10F11.4)
 1006 FORMAT(12A4,I3,3I2,I1,I2)
 1007 FORMAT(5F14.7)
 1008 FORMAT(27H0OLD H2  NEW H2  DIFFERENCE/)
 1009 FORMAT(F6.3,F8.3,E12.3,I5)
 1010 FORMAT((10X,7F10.6))
 1012 FORMAT(1H1,26X,37H OUTPUT FROM VARIMAX ROTATION PROGRAM,
     1/20X,12A4//)
 1013 FORMAT(21X,37H EIGENVALUES FROM PRINCIPAL COMPONENT,
     19H ANALYSIS//(14X,6F10.6))
 1014 FORMAT(1H0,14X,35H NO. ROTATIONS REQUIRED TO MAXIMIZE,
     120H VARIMAX CRITERION =,I3)
 1015 FORMAT(12A6)
 1016 FORMAT(1H0,33X,22H ORIGINAL EIGENVECTORS/)
 1017 FORMAT(14X,7H VECTOR,I3/(14X,6F10.6))
 1018 FORMAT(1H0,33X,23H ORIGINAL FACTOR MATRIX/)
 1019 FORMAT(14X,7H FACTOR,I3/(14X,6F10.6))
 1020 FORMAT(1H0,30X,28H FINAL ROTATED FACTOR MATRIX/)
 1021 FORMAT(1H0,32X,24H VARIANCE OF EACH FACTOR//(14X,6F10.6))
 1022 FORMAT(1H0,17X,37H PERCENTAGE OF VARIANCE ACCOUNTED FOR,
     115H BY EACH FACTOR//(14X,6F10.5))
 1023 FORMAT(1H0,22X,23H ACCUMULATED PERCENTAGE//(14X,6F10.4))
 1024 FORMAT(1H0,29X,30H REDUCED FINAL ROTATED FACTORS/14X,
     19H VARIABLE,21X,8H FACTORS/)
 1025 FORMAT(18X,I2,3X,10F5.2)
 1026 FORMAT(/14X,9HVARIANCES,10F5.1)
 1027 FORMAT((14X,6F10.6))
   24 READ(105,1006)(PROB(I),I=1,12),N,L,NVALUE,JTEST,NRORC,NEIG
      RORC = NRORC
      IF(N) 6000,6000,25
   25 WRITE (108,1012) PROB
      WRITE (106,1012) PROB
      IF (RORC) 26,26,28
   26 DO 27 I=1,N
   27 READ(105,1010)(A(I,J),J=1,L)
      GO TO 29
   28 DO 20 J=1,L
   20 READ(105,1027)(A(I,J),I=1,N)
   29 IF(NEIG) 31,31,30
   31 READ(105,1010)(EIGVAL(J),J=1,L)
      WRITE(108,1013)(EIGVAL(J),J=1,L)
      WRITE(106,1013)(EIGVAL(J),J=1,L)
      DO 32 J=1,L
   32 EIGVAL(J)=SQRT(EIGVAL(J))
      WRITE(108,1016)
      WRITE(106,1016)
      DO 34 J=1,L
      WRITE(106,1017)J,(A(I,J),I=1,N)
   34 WRITE(108,1017)J,(A(I,J),I=1,N)
      DO 33 J=1,L
      DO 33 I=1,N
   33 A(I,J)=EIGVAL(J)*A(I,J)
      WRITE(108,1018)
      WRITE(106,1018)
      DO 35 J=1,L
      WRITE(106,1019)J,(A(I,J),I=1,N)
   35 WRITE(108,1019)J,(A(I,J),I=1,N)
   30 EPS=0.00116
      NC=0
      TV(1)=0.0
      LL=L-1
      NV=1
      FN=N
      CONS=1.0/SQRT(2.0)

      DO 3 I=1,N
    3 H(I)=0.0
      DO 4 I=1,N
      DO 4 J=1,L
      H(I)=H(I)+A(I,J)*A(I,J)
    4 CONTINUE
      DO 5 I=1,N
      H(I)=SQRT(H(I))
      DO 5 J=1,L
    5 A(I,J)=A(I,J)/H(I)
  222 CALL VARMAX
  102 LCYCLE=NV-2
      DO 6 I=1,N
      DO 6 J=1,L
    6 AX(I,J)=A(I,J)*H(I)
      LV=NV-1
      IF(NV-50) 9,999,999
    9 IF(LV-NVALUE) 10,999,10
   10 IF(JTEST) 13,13,16
   16 IF(ABS(TV(NV)-TV(LV))-0.0000001) 11,11,13
   11 NC=NC+1
   12 IF(NC-JTEST) 13,999,999
   13 DO 500 J=1,LL
      II=J+1
      DO 500 K=II,L
      AA=0.0
      BB=0.0
      CC=0.0
      DD=0.0
      DO 15 I=1,N
      XX=A(I,J)
      YY=A(I,K)
      U=(XX+YY)*(XX-YY)
      W=2.0*XX*YY
      CC=CC+(U+W)*(U-W)
      DD=DD+2.0*U*W
      AA=AA+U
      BB=BB+W
   15 CONTINUE
      T=DD-2.0*AA*BB/FN
      B=CC-(AA**2-BB**2)/FN
      P=0.25*ATANF(T/B)
      TAN4P=T/B
      IF(T-B) 1041,1433,1042
 1433 IF(T+B-EPS) 500,1043,1043
 1043 COS4T=CONS
      SIN4T=CONS
      GO TO 5000
 1041 TAN4T=ABS(T)/ABS(B)
      IF(TAN4T-EPS) 8000,1100,1100
 1100 COS4T=1.0/SQRT(1.0+TAN4T**2)
      SIN4T=TAN4T*COS4T
      GO TO 5000
 8000 IF(B) 1150,500,500
 1150 SINP=CONS
      COSP=CONS
      GO TO 1000
 1042 CTN4T=ABS(T)/ABS(B)
      IF(CTN4T-EPS) 9000,1200,1200
 1200 SIN4T=1.0/SQRT(1.0+CTN4T**2)
      COS4T=CTN4T*SIN4T
      GO TO 5000
 9000 COS4T=0.0
      SIN4T=1.0
 5000 COS2T=SQRT((1.0+COS4T)/2.0)
      SIN2T=SIN4T/(2.0*COS2T)
      COST=SQRT((1.0+COS2T)/2.0)
      SINT=SIN2T/(2.0*COST)
      IF(B) 1250,1250,1300
 1300 COSP=COST
      SINP=SINT
      GO TO 7000
 1250 COSP=CONS*COST+CONS*SINT
      SINP=ABS(CONS*COST-CONS*SINT)
 7000 IF(T) 1400,1400,1000
 1400 SINP=-SINP
 1000 X=COSP
      Y=SINP
      DO 100 I=1,N
      AIJ=A(I,J)*X+A(I,K)*Y
      AIK=-A(I,J)*Y+A(I,K)*X
      A(I,J)=AIJ
  100 A(I,K)=AIK
  500 CONTINUE
      GO TO 222
  999 DO 301 I=1,N
  301 HN(I)=0.0
      DO 303 I=1,N
      H(I)=H(I)*H(I)
      DO 302 J=1,L
  302 HN(I)=HN(I)+AX(I,J)*AX(I,J)
  303 HD(I)=HN(I)-H(I)
      WRITE(108,1014)NV
      WRITE(106,1014)NV
      WRITE(108,1020)
      WRITE(106,1020)
      XN=N
      DO 305 J=1,L
      H(J)=0.0
      WRITE(108,1019)J,(AX(I,J),I=1,N)
      WRITE(106,1019)J,(AX(I,J),I=1,N)
      DO 305 I=1,N
      AY(I,J)=AX(I,J)**2
      H(J)=H(J)+AY(I,J)
      HN(J)=(H(J)*100.)/XN
  305 CONTINUE
      WRITE(108,1021)(H(J),J=1,L)
      WRITE(106,1021)(H(J),J=1,L)
      WRITE(108,1022)(HN(J),J=1,L)
      WRITE(106,1022)(HN(J),J=1,L)
      DO 306 J=2,L
  306 HN(J)=HN(J)+HN(J-1)
      WRITE(108,1023)(HN(J),J=1,L)
      WRITE(106,1023)(HN(J),J=1,L)
      WRITE(108,1020)
      WRITE(106,1020)
      WRITE(108,1024)
      WRITE(106,1024)
      DO 307 I=1,N
      WRITE(106,1025)I,(AX(I,J),J=1,L)
  307 WRITE(108,1025)I,(AX(I,J),J=1,L)
      WRITE(108,1026)(H(J),J=1,L)
      WRITE(106,1026)(H(J),J=1,L)
      GO TO 24
 6000 CALL EXIT
      END
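
The innermost step of the program rotates one pair of factors (J, K) through an angle chosen from the sums AA, BB, CC and DD. The Python sketch below illustrates that step under stated assumptions and is not a transcription of the listing: the function name and the sample loadings are hypothetical, and the angle is obtained as one quarter of `atan2(T, B)` instead of through the half-angle branches (labels 1041 to 1400) used above. It also omits the normalization of each row by the square root of its communality that the program applies before rotating.

```python
import math

def rotate_pair(x, y):
    """Rotate two loading columns toward a larger varimax criterion."""
    n = len(x)
    aa = bb = cc = dd = 0.0
    for xi, yi in zip(x, y):
        u = (xi + yi) * (xi - yi)   # x**2 - y**2
        w = 2.0 * xi * yi
        aa += u
        bb += w
        cc += (u + w) * (u - w)     # u**2 - w**2
        dd += 2.0 * u * w
    t = dd - 2.0 * aa * bb / n
    b = cc - (aa * aa - bb * bb) / n
    phi = 0.25 * math.atan2(t, b)   # tan(4 * phi) = T / B, quadrant-aware
    c, s = math.cos(phi), math.sin(phi)
    new_x = [xi * c + yi * s for xi, yi in zip(x, y)]
    new_y = [-xi * s + yi * c for xi, yi in zip(x, y)]
    return new_x, new_y, phi

# Hypothetical loadings of four variables on two factors
x = [0.7, 0.6, 0.5, 0.4]
y = [0.3, -0.4, 0.5, -0.6]
new_x, new_y, phi = rotate_pair(x, y)
```

Because the rotation is orthogonal, each variable keeps its communality (the sum of its squared loadings on the two factors). Applying the function a second time to the rotated pair returns an angle of essentially zero, since the pair is already at its optimum.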
      SUBROUTINE VARMAX

      DIMENSION A(50,20),SA(20),SA2(20),V(50),TV(50)
      COMMON A,V,TV,NV,N,L,FN,T,B,P
      SV=0.0
      NV=NV+1
      DO 6 J=1,L
      SA(J)=0.0
      SA2(J)=0.0
    6 CONTINUE
      DO 8 J=1,L
      DO 7 I=1,N
      SA(J)=SA(J)+A(I,J)*A(I,J)
    7 SA2(J)=SA2(J)+(A(I,J)*A(I,J))**2
      SA(J)=SA(J)**2
    8 V(J)=(FN*SA2(J)-SA(J))/FN**2
      DO 9 J=1,L
    9 SV=SV+V(J)
      TV(NV)=SV
      RETURN
      END
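
Subroutine VARMAX evaluates the varimax criterion: for each factor, N times the sum of the fourth powers of the loadings minus the square of the sum of their squares, divided by N squared, and then summed over the factors. A minimal Python sketch of the same arithmetic follows; the function name and the two small test matrices are hypothetical, and the sketch takes whatever loadings it is given, whereas the program calls VARMAX on loadings that have already been normalized by row.

```python
def varimax_criterion(a):
    """Sum over factors of (n * sum(a**4) - (sum(a**2))**2) / n**2."""
    n = len(a)          # number of variables (rows)
    l = len(a[0])       # number of factors (columns)
    total = 0.0
    for j in range(l):
        s2 = sum(a[i][j] ** 2 for i in range(n))
        s4 = sum(a[i][j] ** 4 for i in range(n))
        total += (n * s4 - s2 * s2) / (n * n)
    return total

# Hypothetical loadings: a "simple structure" matrix and a fully mixed one
r = 0.5 ** 0.5
simple = varimax_criterion([[1.0, 0.0], [0.0, 1.0]])
mixed = varimax_criterion([[r, r], [r, -r]])
```

The simple-structure matrix scores higher than the mixed one, which is the behavior the rotation exploits when it iterates until the criterion stops increasing.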
APPENDIX H


MEANS AND STANDARD DEVIATIONS OF VARIABLES

                  RAW DATA                TRANSFORMED DATA
VARIABLE          MEAN     STD. DEV.          MEAN     STD. DEV.

A            21.846298     16.251465      1.254370      0.251140
SHP           2.981933      0.648277      0.465232      0.088734
AZ          205.619995     57.481766      2.295727      0.125083
ELEV       3496.000000    619.873047      3.536823      0.077241
GNDS          0.044714      0.025749     -1.434831      0.285002
GNDL          0.669614      0.240575     -0.209380      0.187961
FREQ         12.685501      6.949473      1.033136      0.256508
L            12.912598      5.338099      1.072739      0.178872
S             0.007940      0.004395     -2.135564      0.152279
USE          13.959315     20.957428      0.764478      0.584413
INFR          0.096311      0.003343     -1.016577      0.015277
POND          0.018874      0.009896     -2.017157      1.049355
I             0.088012      0.034093     -1.247360      0.426518
ISD           0.103150      0.109601     -1.282492      0.569079
D            51.339890     83.787094      1.326674      0.581200
TDF           3.995995      6.934675     -1.047191      2.702044
TPCP          1.367451      1.546431     -0.091693      0.462670
API           0.354819      0.347613     -0.723012      0.565864
SOLM         14.580000      3.881351      1.147837      0.121640
WDIR          4.179998      1.913325      0.570156      0.223902
WEEK         21.119980      7.272090      1.293948      0.174260
AIRT         46.359482     17.613785      1.619226      0.232972
ATSD         10.277411      4.350557      0.970650      0.201078
WVEL         10.679915      3.323439      1.001662      0.156743
WVSD          6.786855      1.355226      0.816315      0.116398
SOLT         50.308075     17.324463      1.670814      0.171252
STSD          4.549333      2.450234      0.604456      0.219946
DEGD         56.365906     20.872681      1.710973      0.206981
SWEQ          0.245000      0.408353     -3.738943      2.307280
QMAX        147.221939    289.630127      1.781961      0.546014
RUNE          0.179998      0.313125     -1.169482      0.629060
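
For readers who wish to reproduce summary statistics of this kind, the Python sketch below computes a mean and standard deviation for a raw series and for a transformed series. It rests on two assumptions that this table does not state: that the transformation is a base-10 logarithm, and that the standard deviation uses the N - 1 divisor. The function names and the sample values are hypothetical and are not data from the research watersheds.

```python
import math

def mean_and_std(values):
    """Arithmetic mean and sample standard deviation (N - 1 divisor)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance)

def summarize(values):
    """Statistics of the raw values and of their base-10 logarithms."""
    raw = mean_and_std(values)
    logged = mean_and_std([math.log10(v) for v in values])  # assumed transform
    return raw, logged

# Hypothetical positive-valued series (for example, drainage areas)
raw_stats, log_stats = summarize([4.2, 10.5, 21.0, 36.8, 55.3])
```

The logarithm requires strictly positive values, so a variable that can be zero would need an offset or a substitute value before transformation.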